The Marshmallow Study revisited

23 Sep 2023 | Evidence is not proof

One of the most famous psychology experiments conducted on children is the Marshmallow Study. In a 1972 paper, Walter Mischel and co-authors gave three-to-five-year-olds at the Stanford Bing Nursery School a marshmallow. The children could eat it right away, but if they waited 15 minutes, they’d get a second marshmallow.

A series of follow-up studies tracked the kids over time. They found that those who were able to wait for the second marshmallow ended up doing better in life: higher SAT scores, higher educational attainment, lower body mass index, less drug use, fewer mental health issues, greater tolerance for frustration, and many other desirable outcomes. That makes sense: if you can delay gratification, you’ll study harder, exercise more, and resist vices. These studies are seen as hard evidence on the value of self-control, and led to marshmallow tests and self-control techniques being integrated into school curriculums. Sesame Street’s notoriously gluttonous Cookie Monster appeared in a video where he has to wait to eat cookies and learns some strategies to help him stay in control.

But data is not evidence, because it may not be conclusive — there may be rival theories. Waiting for the second marshmallow might not reflect self-control but something else:

  • Affluence. Perhaps poorer kids eat whatever food’s in front of them, and that’s rational because they’re not sure where the next meal will come from. There might be food in the cupboard today, but it might be gone tomorrow, so waiting is risky.
  • Family background. Kids with a difficult home life may have learned not to trust others: if their parents constantly break promises (whether through untrustworthiness or financial necessity), they can’t be sure that the experimenter will bring the second marshmallow.

It could be affluence or family background that drove the kids’ future success, not self-control. If so, teaching children to delay gratification without changing these other drivers might make little difference. Indeed, it could backfire, if it teaches kids to be naively trusting when they need to grab what they can, or diverts resources away from other initiatives targeted at the real causes, such as free school meals.

In theory, you’d solve the problem by controlling for affluence and family background. But in practice, the Stanford researchers couldn’t. The experiments were on children of Stanford faculty and staff, so there were few differences in wealth and upbringing that could be controlled for.

So a separate set of researchers used a much bigger dataset (900 rather than 90 kids), and crucially one containing a variety of backgrounds: over half had mothers who hadn’t completed college. For these kids, the link between waiting for the second marshmallow and academic outcomes at age 15 was only half as strong as in the original studies, and it fell by a further two thirds (to a sixth of the original size) after controlling for family background, early cognitive ability and the home environment. Moreover, there was no correlation, even without controls, with behavioural outcomes at age 15.
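To see why adding controls can shrink a link like this, here is a minimal sketch in Python using made-up data, not the data from either study. It assumes a deliberately extreme world in which affluence drives both how long a child waits and how well they do later, while waiting itself has no effect at all. All the variable names and numbers are hypothetical, chosen only for illustration.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (one predictor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def residuals(x, y):
    """The part of y that is left after removing its linear link to x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

# Hypothetical simulated children (assumption: not real data).
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
affluence = [rng.gauss(0, 1) for _ in range(n)]

# Assumption: affluence raises both waiting time and the later outcome,
# and waiting has NO direct effect on the outcome.
wait_time = [a + rng.gauss(0, 1) for a in affluence]
outcome = [a + rng.gauss(0, 1) for a in affluence]

# Naive link: outcome against waiting time, ignoring affluence.
naive = slope(wait_time, outcome)

# Controlled link: strip affluence out of both variables first,
# then compare what remains.
controlled = slope(residuals(affluence, wait_time),
                   residuals(affluence, outcome))

print(f"naive slope:      {naive:.2f}")
print(f"controlled slope: {controlled:.2f}")
```

In this toy setup the naive slope should come out near 0.5 even though waiting does nothing, and the controlled slope should sit close to zero. The real data is messier: the link in the larger study shrank sharply after controls rather than vanishing, and a simulation like this only shows that a rival explanation is possible, not that it is the right one.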

This is an example of why evidence is not proof: it may not be universal. What you learn from 90 kids of Stanford faculty and staff might not apply to other children. This warns us about taking the findings from one specific study and using them to influence educational policy across the country as a whole.

