Sex And Lies! The Iffy Science Of Measuring Calories

by Trevor Butterworth

As you may have heard, sex doesn’t burn nearly as many calories as you might have been led to believe. But this is far from the only finding in obesity research that wilts under intense scrutiny, as the rest of this paper in the New England Journal of Medicine revealed. Each piece of received wisdom about weight loss and dieting that the study took on (eat fruits and vegetables! eat breakfast! etc.) was found wanting. The authors’ conclusion: “False and scientifically unsupported beliefs about obesity are pervasive in both scientific literature and the popular press.” What we think of as hard science can, it turns out, be pretty soft.

One example of why: Measuring the energy expended during sex turns out to be a Himalayan ascent of such ferocious complexity that even a seasoned alpinist might be reduced to wild surmise. As the study’s lead author, David Allison, who is the director of the Nutrition Obesity Research Center, explained, “You don’t need to know anything about metabolism to know that someone would have to measure people having sex. And knowing that you’d say, ‘well, it’s probably not the same for a 20-year-old man versus an 80-year-old man; and it might not be the same for a man or a woman; and it might not be the same if it’s the same person you’ve been married to for 20 years or a brand new person.’ So all these factors may come into play and, of course, there may be quite individual variability from person to person.”

A study aiming to figure out the energy expended during sex would, therefore, need to have a large sample of people, representing all ages and both sexes (new lovers, people bored with their partners), just to capture a representative range of all possible energy expenditures. And once this sea of flesh was found, you’d have to measure the amount of oxygen they consume and the amount of CO2 they expire, which requires a filtered, airtight room capable of monitoring these changes. Think about what doing that study would involve: hundreds of people willing to get it on in a submarine-like room in a university lab being monitored by academics. And then there are all the parameters: who and what goes where and when?

“Who’s going to volunteer for this study?” asked Allison. “And if they know they’re being watched by some investigators,” he added, “might this affect performance?” Ack; it might indeed: does inhibition subdue the heart or does voyeurism quicken it? A study that accounted for all this variability and answered all these questions with sufficient scientific rigor would surely be legendary, notorious, epic; but Allison and his coauthors could find nothing in the medical literature.

Still, where there’s a will, there’s usually a second best way of measuring something. In this case, you can put on a mask with a breathing tube attached to a metabolic cart, and have your oxygen intake measured during activity. This presents researchers and putative lovers with another potentially confounding problem: “Is someone having sex with a rubber face mask and a tube in their mouth the same as sex in ordinary circumstances?” asks Allison. “And again, who would volunteer for that?”

It turns out that in 1984 ten married couples volunteered for precisely this kind of study, although only the energy expended by the men was measured. The study was trying to assess the cardiac demands of sex against a background where men who had recovered from heart attacks were reluctant to exert themselves in bed in case they killed themselves. The 10 male volunteers ranged in age from 25 to 43 and — worth noting — the average physical fitness was high. Small though the sample size was, the researchers were, gratifyingly, thorough. They began by measuring foreplay, although they conceded that the breathing mask made oral stimulation challenging, and the electrodes and blood pressure cuffs constrained movement; nevertheless, there was a small but statistically significant increase in heart rate by four to eight beats per minute. The biggest bang, so to speak, in terms of energy expenditure, was when the man was on top of the woman, although it varied; some men had more energetic orgasms than others.

Even here, as Steven Heymsfield, one of Allison’s co-authors, explains, the data they needed required translation. As best as can be determined, the average bonk burns no more energy than you’d expend walking briskly for about four minutes. This contrasts with the 30 minutes to an hour’s brisk walking everyone had assumed (21 versus 150–300 kilocalories).

That’s quite a comedown.
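For readers who want to see how a figure like 21 kilocalories can fall out of a few inputs, here is a rough sketch of the usual metabolic-equivalent (MET) arithmetic. The intensity of 3 METs and the body mass of 70 kg are assumptions for illustration and are not stated in this article; the duration is the roughly six minutes the couples averaged. It relies on the common convention that 1 MET is about 1 kilocalorie per kilogram per hour.

```python
# Back-of-the-envelope MET arithmetic (a sketch, not the study's own method).
# ASSUMED inputs: the intensity and body mass below are illustrative values
# that do not appear in the article.
MET_INTENSITY = 3.0   # assumed intensity of the activity, in METs
BODY_MASS_KG = 70.0   # assumed body mass (about 154 lb)
DURATION_MIN = 6.0    # approximate average duration reported for the couples


def kcal_expended(mets, mass_kg, minutes):
    """Energy in kcal, using the convention 1 MET ~ 1 kcal per kg per hour."""
    return mets * mass_kg * (minutes / 60.0)


if __name__ == "__main__":
    total = kcal_expended(MET_INTENSITY, BODY_MASS_KG, DURATION_MIN)
    print(f"Estimated energy expended: {total:.0f} kcal")
```

Change any one input (an older or heavier subject, a longer or more vigorous session) and the answer moves with it, which is exactly the variability Allison describes.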


Naturally, the finding grabbed the media by its news instincts. Somewhat lost in the excitement was the bigger picture: if measuring the energy expended in just six minutes was this difficult (yes, that was about the average amount of time the couples spent getting it on), how would you scale the measurement up to a lifetime of dynamic metabolic change? How would you measure the progress of obesity-related disease through thousands of constant (but not necessarily consistent) energy inputs and outputs and then adjust for genetic inheritance and age? Sex, says Allison, was simply an “intellectually useful” way to get people to think about the kinds of claims we just assume to be true, because we assume they have been measured accurately. There were bigger and more intellectually important targets to question. There was, for instance, breakfast.

Now eating breakfast has been shown to provide clear cognitive benefits for school children; but skipping it did not, Allison said, appear to leave you at the mercy of weight gain from overcompensating with extra calories later in the day. Or rather, there was no solid evidence either way; it was just one of those things that everybody presumed was true without asking whether it had been shown to be true.

There was breastfeeding. Again, there is nothing wrong with breastfeeding; as Allison is quick to emphasize, it’s highly recommended. But according to the best-conducted and most careful studies we have, it’s just not protective against obesity, as people have claimed for almost a century. Fruits and vegetables? Great, please eat them; but don’t do so under the impression that they will produce weight loss in the absence of any other changes in diet and behavior. Snacking? Not quite the sneaky culprit of weight gain we have been led to believe, at least according to the data we have. Setting reasonable goals for weight loss? Meh: you might as well try to lose as much as possible as quickly as possible because it could work for you. On it goes, and due to the editorial limitations of the New England Journal of Medicine, the authors have started with the highlight reel; a much longer paper will be published later in the year.

Even so, the scale of the debunking led some to speculate that a conspiracy might be afoot, given Allison’s extensive previous funding from the food industry (his nutrition research center at the University of Alabama also, it should be noted, receives extensive funding from the National Institutes of Health). “It raises questions about what the purpose of this paper is,” Marion Nestle, a professor of nutrition and food studies at New York University and a highly vocal critic of the food industry and industry-funded science, told the Associated Press. On NBC, she said, “I can’t understand the point of the paper unless it’s to say that the only things that work are drugs, bariatric surgery, and meal replacements, all of which are made by companies with financial ties to the authors.” A mostly conjectural discussion about money, motive, bias, and why the paper’s authors might have covered some myths and not others followed on Health News Review.

“The point of our article was that evidence matters,” Allison told me. “If someone disagrees with one or more of our conclusions, let them state the evidence that supports their disagreement. In science, three things matter: The data; the methods by which the data were collected, which define them and provide their validity or ‘knowledge value;’ and the logic by which the data are connected to conclusions.”

In contrast to Nestle, the dean of evidence-based debunkers, John Ioannidis, who is a professor of medicine at Stanford University’s Prevention Research Center, concurs with Allison. “This is an excellent article that highlights some of the many myths of obesity research,” he said via email. “Faulty health claims are often made with little or no evidence, or with data that are clearly biased. Then these claims are propagated by a conglomerate of expert opinion, distorted media coverage, and inertia.”

One of the particular problems with obesity research, he continues, is that when you compare the field, say, with drug research, there are far fewer randomized trials on prevention and public health. “We should aim at attaining more rigorous evidence and more realistic estimates about how much preventive interventions work,” he said. “This would require sponsoring a sufficient number of large-scale preventive, public health-oriented trials. With few, small trials the evidence remains potentially fragmented and prone to exaggerated and inflated results. Observational studies are useful, but over-reliance on them can cause a credibility crisis in public health research.”


Of course, weak measurement isn’t just a problem for academic credibility; it has a direct bearing on our health. “Think about the first time you got on an airplane,” said Peter Attia, “how much rigorous science had worked out the nuances of aerodynamics? Or when you first took an antibiotic for a certain type of infection, what level of evidence was necessary to let you and your physician feel comfortable about that being a reasonable risk/reward scenario? And yet, when your physician or your nutritionist tells you to eat ‘that’ but not ‘this,’ how much evidence is supporting the recommendation?”

Attia, a former physician and consultant for McKinsey & Company, recently co-founded the nonprofit Nutrition Science Initiative with science writer Gary Taubes. Their shared concern that dietary claims were lacking the kind of scientific rigor expected in other areas led them to seek philanthropic funding to support teams of top researchers to try and produce better data. But their project also reflects a frustration with how science is currently funded. “We spent nearly $5 billion to build CERN, and several billion dollars a year running CERN,” said Attia, “but we can’t spend 50 million dollars a year doing best-in-class science on the single most important question that impacts human health right now?”

But even 50 million dollars pales against the amount of money a pharmaceutical company will put into one drug, much of it directed at massive experiments to find effects. This is one of the issues that troubles Heymsfield, who, as executive director of the Pennington Biomedical Research Center, heads the largest academic-based nutritional center in the world. Even the best evidence in obesity research, the randomized controlled trials, mostly comes from tiny samples and short time frames. Given how difficult it is to show positive effects in drug trials that cost hundreds of millions of dollars and involve following thousands of people over several years (and how skeptical we often are about the validity of those results), we show, he said, little comparable skepticism about the very small effects found in nutrition studies involving hundreds of people over several months.

But while skepticism may be a cardinal virtue in our information society, it represents a particular problem for a public health community searching for solutions to obesity. If a problem has been poorly measured then it stands to reason that any given solution to that problem may not be a solution at all. But if you keep foisting badly measured policy experimentalism on the public enough times in the name of solving the obesity crisis, all the while claiming that science shows that this will work, then the public will simply stop trusting science and go on eating based on a much more internally reliable measure: pleasure.


As with any field, public health has a radical wing who want to push past limited evidence and say that science has provided enough of a rationale to take action against the enemies of health (in obesity, as we’ve discussed before, the current enemy is sugar). To the hardcore evidence-based wing, all this is just bad — or at best, uncertain — science with an unhealthy dose of media-proselytizing to give it the certainty required to become policy. The enmity between the two groups is often palpable at scientific meetings, bordering on a clash of incommensurable values: the need for action now versus the need to do the best possible science; “we’ve got to do something!” versus “but what if that something is wrong?”

But there is a revolution in measurement coming to public health. Heymsfield sees scientists with more quantitative backgrounds and skills entering the obesity research field and trying to come up with ways of measuring what has not been measured well or at all. One of these is Kevin Hall. Trained as a biophysicist, he now runs a biological modeling lab at the National Institute of Diabetes and Digestive and Kidney Diseases. As he sees it, “the field of nutrition and metabolism is actually quite quantitative, when you compare it to other fields, like immunology — at least at the physiological level.” The problems, he explains, are getting accurate data and then integrating that data in a mechanistic way.

Hall was responsible for filling in the crucial measurements that elucidated one of the most widespread myths highlighted by Allison et al.: the idea that small, consistent changes in energy intake or expenditure will, over time, lead to large changes in weight. The assumption appears to have been based on the 1958 calculation by Max Wishnofsky that one pound of body fat gained or lost is equal to 3,500 kilocalories. This seemed to give people a convenient way to estimate weight loss through diet or exercise, while promising extremely convenient results. If you simply knocked 100 kilocalories off your energy balance each day (a ten-minute jog, or a mile walk), you’d end up losing over 50 pounds in five years. Little wonder that early proposals for soda and fat taxes promised to save Americans from themselves: pay a little more, consume a little less, watch a lot of weight disappear in a few years.
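The arithmetic behind that promise is simple enough to write down. What follows is only a sketch of the static 3,500-kilocalorie rule as described above; the function name is made up for illustration, and a 365-day year is assumed.

```python
# Static "3,500 kcal per pound" rule, as described in the text.
# The function name and the 365-day year are illustrative assumptions.
KCAL_PER_POUND = 3500.0  # Wishnofsky's figure, as quoted in the article


def static_weight_loss_lb(daily_deficit_kcal, years):
    """Predicted loss if every kcal of deficit converts to weight forever."""
    days = 365 * years
    return daily_deficit_kcal * days / KCAL_PER_POUND


if __name__ == "__main__":
    # 100 kcal/day for five years: 182,500 kcal / 3,500 = about 52.1 lb
    print(round(static_weight_loss_lb(100, 5), 1))
```

Note what the rule leaves out: nothing in it allows the body's energy expenditure to change as weight comes off, so the predicted loss grows in a straight line for as long as you care to extend it.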

Hall first heard the claim listening to a dietician make a calculation for an obese patient. His intuition told him that this calculation was incorrect and would lead to exaggerated weight loss predictions. When he asked for a reference, he was pointed to a nutrition and dietetics textbook. “I subsequently found the mistake everywhere I looked.” People weren’t stopping to think “about the dynamic interaction between energy intake and expenditure, which is complicated,” he says. What they failed to take into account was that “the rate of weight loss changes over time and is primarily determined by the imbalance between energy intake and expenditure — a value that also changes over time.” To radically simplify his model, this means that cutting calories in your diet leads to a decreasing calorie expenditure, which in turn slows weight loss until weight eventually plateaus after a few years. “Of course,” says Hall, “cheating on your diet will cause your weight to plateau much sooner.” In the case of soda taxes, Hall and researchers at the US Department of Agriculture showed how static modeling overstated weight loss by 346 percent after five years.
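To make the contrast concrete, here is a deliberately crude toy model, not Hall's published one. It assumes that every pound lost lowers daily energy expenditure by a fixed amount (10 kilocalories per day per pound, a value chosen purely for illustration), so the daily imbalance shrinks as weight comes off. All names and parameters are hypothetical.

```python
# Toy comparison of static vs. dynamic weight-loss predictions.
# This is NOT Hall's model: it is a one-line feedback loop with
# ASSUMED parameters, meant only to show why the curve flattens.
KCAL_PER_POUND = 3500.0         # energy content of a pound, per the static rule
EXPENDITURE_DROP_PER_LB = 10.0  # ASSUMED: kcal/day less burned per pound lost


def static_loss_lb(daily_cut_kcal, days):
    """Static rule: the full daily cut counts toward weight loss every day."""
    return daily_cut_kcal * days / KCAL_PER_POUND


def dynamic_loss_lb(daily_cut_kcal, days):
    """Daily stepping: the imbalance shrinks as expenditure falls with weight."""
    lost = 0.0
    for _ in range(days):
        imbalance = daily_cut_kcal - EXPENDITURE_DROP_PER_LB * lost
        lost += imbalance / KCAL_PER_POUND
    return lost


if __name__ == "__main__":
    for years in (1, 3, 5):
        days = 365 * years
        print(years,
              round(static_loss_lb(100, days), 1),
              round(dynamic_loss_lb(100, days), 1))
```

With these made-up parameters, the dynamic curve levels off near 10 pounds (the daily cut divided by the assumed expenditure drop per pound), while the static rule keeps climbing. The sketch does not attempt to reproduce the 346 percent figure, which came from a far more detailed model applied to soda taxes.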

“We know the kind of mathematics to apply,” says Hall, lightheartedly chiding dieticians for not getting to grips with calculus. But the more serious problem is finding the best data to set the parameters for an accurate and precise model. And when you move up from controlled feeding experiments that define these physiological parameters to the population at large, things rapidly go downhill. “We still don’t have good ways of measuring what people eat when they are outside a metabolic ward,” said Hall. “And until we solve that problem we’re going to be at an impasse. If we can’t measure what people eat, then how are we going to understand what the relationship between diet and health actually is?”

What we need are quantitative theories, he says, and then to take a group of people, manipulate their food environment in a controlled manner, where we can accurately measure food intake behavior for a period of months or years, collect the data and then see if the theories match up with the experiments. Right now, it seems that we’d rather implement poorly controlled experiments on the population as a whole instead of funding actual experiments to see what happens to people’s behavior and energy balance.

“Until we solve this problem,” he says, “all bets are off in making strong claims about the public health consequences of diet intervention.”

Previously: The Sugar Wars: Science’s Fierce, Geeky Debate Over Soda

Trevor Butterworth is a regular contributor on science and technology for Newsweek. Photo courtesy of James Vaughan.