Looking at study results with a critical eye


As a physician you are the embodiment of delayed gratification. You spent more than 20 years in school before you earned a degree that then allowed you to spend another 3-plus years in training before anyone would consider you a “real” doctor. Somewhere along that long and shallow trajectory someone may have said, “You must have done really well on the marshmallow test.”

Dr. William G. Wilkoff practiced primary care pediatrics in Brunswick, Maine, for nearly 40 years.

First described in 1990 by Shoda et al. in the journal Developmental Psychology, the marshmallow test study found that children who could wait longer for a reward (in this case a marshmallow) had higher SAT scores as teenagers (Dev Psychol. 1990 Nov;26[6]:978-86). In large part because the results of the study feel intuitive and square with our sense of fairness, the marshmallow test has become one of the unquestioned cornerstones of developmental psychology.

That is, until this year, when Watts et al. attempted to replicate the initial study by Shoda et al. and failed to find that the associations between delayed gratification and adolescent achievement were anywhere near as significant as those reported in the 1990 study (Psychol Sci. 2018 May. doi: 10.1177/0956797618761661). Watts et al. suggest that the discrepancy may be explained in part by a failure to adequately control for family background, home environment, and early cognitive ability in the initial experimental design.

Is there a message here? Should we stop wasting our time reading papers from the developmental psychology literature? Not just yet. There are more papers coming out in which the authors attempt to replicate other landmark studies, often without success (“Undergrads Can Improve Psychology,” by Russell T. Warne and Jordan Wagge, The Wall Street Journal, June 20, 2018). Let’s wait and see how much more debunking there is going to be before we throw the baby out with the bath water.

The real message is that every study we encounter should be read with a critical eye regardless of how prestigious the institution of origin and regardless of how much it appeals to our common sense. Our intuition can be a powerful tool when we are looking for answers, but it can lead us astray if we take it too seriously.

It is often said that a good experiment is one that raises more questions than it answers. You don’t have to remember all that stuff you learned when you studied statistics to be able to question the results of a study you read in a peer-reviewed journal. I find that in many of the papers I read I have serious concerns about how well the authors have controlled for the not-so-obvious variables.

So where does this failed attempt at replicating the original marshmallow test study leave us? It is still very likely, given your aptitude for delayed gratification, that had you been given the test as a preschooler, you would not have even touched the marshmallow until the experimenter re-entered the room to end the test and then ... you probably would have offered to share it with her.

