The Marshmallow Study revisited

23 Sep 2023 | Evidence is not proof

One of the most famous psychology experiments conducted on children is the Marshmallow Study. In a 1972 paper, Walter Mischel and co-authors gave three-to-five-year-olds at the Stanford Bing Nursery School a marshmallow. They could eat it now, but if they waited 15 minutes, they’d get a second marshmallow.

A series of follow-up studies tracked the kids over time. They found that those who were able to wait for the second marshmallow ended up doing better in life: higher SAT scores, higher educational attainment, lower body mass index, less drug use, fewer mental health issues, greater tolerance of frustration, and many other desirable outcomes. That makes sense: if you can delay gratification, you’ll study harder, exercise more, and resist vices. These studies are seen as hard evidence of the value of self-control, and led to marshmallow tests and self-control techniques being integrated into school curriculums. Sesame Street’s notoriously gluttonous Cookie Monster appeared in a video where he has to wait to eat cookies and learns some strategies to help him stay in control.

But data is not evidence, because it may not be conclusive — there may be rival theories. Waiting for the second marshmallow might not reflect self-control but something else:

  • Affluence. Perhaps poorer kids eat whatever food’s in front of them, and that’s rational because they’re not sure when the next meal will come. There might be food in the cupboard today, but it might be gone tomorrow, so waiting is risky.
  • Family background. Kids with a difficult home life have learned not to trust others — if their parents constantly break promises (either due to untrustworthiness or financial necessity), then they can’t be sure that the experimenter will bring the second marshmallow.

It could be affluence or family background that drove the kids’ future success, not self-control. If so, teaching children to delay gratification without changing these other drivers might make little difference. Indeed, it could backfire, if it teaches kids to be naively trusting when they need to grab what they can, or diverts resources away from other initiatives targeted at the real causes, such as free school meals.

In theory, you’d solve the problem by controlling for affluence and family background. But in practice, the Stanford researchers couldn’t. The experiments were on children of Stanford faculty and staff, so there were few differences in wealth and upbringing that could be controlled for.
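To see what "controlling for" a rival explanation does, here is a minimal simulation sketch. It is not the method used in any of the studies, and every number in it is made up. It assumes a single hypothetical "affluence" score that drives both waiting and later outcomes, while waiting itself has no effect at all. Comparing like with like (kids in the same narrow affluence band) is used as a simple stand-in for regression controls.

```python
import math
import random
import statistics
from collections import defaultdict


def simulate(n=20000, seed=0):
    """Return (affluence, waited, outcome) rows from a made-up world.

    Assumption: affluence drives BOTH waiting and the later outcome.
    Waiting has no causal effect on the outcome whatsoever.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        affluence = rng.gauss(0, 1)                 # hypothetical confounder
        waited = affluence + rng.gauss(0, 1) > 0    # richer kids wait more often
        outcome = affluence + rng.gauss(0, 1)       # outcome ignores `waited`
        rows.append((affluence, waited, outcome))
    return rows


def raw_gap(rows):
    """Average outcome of kids who waited minus kids who did not."""
    waited = [o for _, w, o in rows if w]
    ate = [o for _, w, o in rows if not w]
    return statistics.mean(waited) - statistics.mean(ate)


def gap_within_bands(rows, width=0.25):
    """Same gap, but only comparing kids in the same affluence band."""
    bands = defaultdict(lambda: ([], []))
    for affluence, waited, outcome in rows:
        band = math.floor(affluence / width)
        bands[band][0 if waited else 1].append(outcome)

    total, weight = 0.0, 0
    for waited_outcomes, ate_outcomes in bands.values():
        if not waited_outcomes or not ate_outcomes:
            continue  # nobody to compare against in this band
        size = len(waited_outcomes) + len(ate_outcomes)
        gap = statistics.mean(waited_outcomes) - statistics.mean(ate_outcomes)
        total += size * gap  # weight each band by its number of kids
        weight += size
    return total / weight


if __name__ == "__main__":
    rows = simulate()
    print("gap without controls:", round(raw_gap(rows), 2))
    print("gap within affluence bands:", round(gap_within_bands(rows), 2))
```

In this invented world, the uncontrolled gap is large even though waiting does nothing, and it all but disappears once kids are compared with others of similar affluence. The catch is that banding only works if the sample contains kids at different affluence levels in the first place.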

So a separate set of researchers took a much bigger dataset (900 rather than 90 kids), and crucially one containing a variety of backgrounds: over half had mothers who hadn’t completed college. For these kids, the link between waiting for the second marshmallow and academic outcomes at age 15 was only half as strong as in the original studies — and it fell by two thirds (to a sixth of the original size) after controlling for family background, early cognitive ability and the home environment. Moreover, there was no correlation — even without controls — with behavioural outcomes at age 15.
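The arithmetic behind "a sixth of the original size" can be checked in a couple of lines. This is only a sketch of the proportions quoted above, with the original effect normalised to 1 (an assumption made for illustration, not a figure from either study).

```python
from fractions import Fraction

original = Fraction(1)                                   # original effect, normalised to 1
broader_sample = original * Fraction(1, 2)               # half as strong in the larger, more varied sample
with_controls = broader_sample * (1 - Fraction(2, 3))    # then falls by two thirds once controls are added

print(with_controls)  # 1/6
```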

This is an example of why evidence is not proof: it may not be universal. What you learn from 90 kids of Stanford faculty and staff might not apply to other children. This warns us about taking the findings from one specific study and using them to influence educational policy across the country as a whole.
