30 December 2009

Interpreting Science Writing

Note to readers: This column is coming out much later than I had intended. I apologize for that. I think I may have blown my resolution from earlier this year to put out one post per month. I really struggled with this one. As I wrote, the article kept getting longer and more detailed. There is no shortage of bad science coverage out there. I was stalled and my wife helped me break out of it (thanks). Ultimately, I probably should treat this as an overview post, then put in detailed posts in the future on some of the individual points.

So please let me know in the comments if you are interested in reading more detail about any of the points made in this post. --e4e

Writing about science writing is like writing about vocabulary or word usage. There is a really good chance that you will find inconsistencies in some of your own practices. Shame and ridicule have never stopped me though, so here goes.

Sensational headlines call to us to stop eating saturated fat. Or to stay out of the sun. Do more cardio (or less). How can we make sense of this hodge-podge?

First, research has a well-defined and specific structure and process. The researcher states a null hypothesis, e.g. saturated fat does not cause heart disease. Then they determine a protocol for testing that hypothesis, run the study, and analyze the results. In the standard scientific method, the results do not prove the hypothesis--they either reject it or fail to reject it. This is similar to a court case, where a defendant is found guilty or not guilty. A "not guilty" verdict does not prove innocence, only that there was not sufficient evidence to prove guilt.
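
The reject / fail-to-reject logic can be sketched with a toy permutation test (a hypothetical example; the numbers are invented, not from any real study):

```python
import random

random.seed(42)

# Invented "cholesterol change" numbers for two made-up diet groups.
group_a = [4, 7, 5, 6, 8, 5, 7, 6]
group_b = [5, 6, 4, 7, 5, 6, 5, 4]

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(group_a) - mean(group_b)

# Under the null hypothesis ("diet makes no difference"), the group
# labels are arbitrary, so shuffle them and count how often chance
# alone produces a difference at least as large as the one observed.
pooled = group_a + group_b
trials = 10000
extreme = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = mean(pooled[:8]) - mean(pooled[8:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / trials
# We either reject the null or fail to reject it --
# we never "prove" the hypothesis.
print("p =", p_value, "->", "reject" if p_value < 0.05 else "fail to reject")
```

Note that even a "reject" here would only say the data are unlikely under the null, not that the alternative is proven.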

Repeating studies and testing variations of the hypothesis are an important part of this as well. For example, if the researchers use palmitic acid (a saturated fat) in the original protocol, someone else might try a similar experiment with lauric acid (another saturated fat) to see if the result is generalizable to another saturated fat. Or perhaps they would try a similar test with fewer refined carbohydrates in the diet to test a suspected interaction.

In the final analysis, only when a hypothesis has been tested multiple times with valid, well-designed studies, along with reasonable variations, and has been peer-reviewed, can we begin to say that the hypothesis (or its evolved form) is very likely true. This is even more true in human studies, where numerous factors can confound results.

Sadly, in the current state of our society, and perhaps even in research communities, we take as given hypotheses that are much less well founded. A typical flow looks more like this: a researcher analyzes a data set and looks for relationships. If they find one, they build a hypothesis that rationalizes what they think they see in the full data set, or even worse, in a filtered or smoothed one. So far, not too bad. If they then take that analysis and design a good intervention study, they may be on to something.

Unfortunately, too often a science writer gets wind of the hypothesis (the researchers are not innocent in this) and publishes a sensational story before the real work begins.

As always, you are your own n = 1. Read carefully, learn voraciously, and caveat emptor.

Following are some considerations to think about when reading and interpreting science writing.

1. Many, or perhaps most, mainstream science writers are much stronger at writing than science. They do not have the ability, inclination, or motivation to discriminate between important and unimportant results. Question mainstream media (MSM) reports of science. Gary Taubes is seen by many as one of the better science writers around. Who are some others?

2. Correlation does not imply causation; lack of correlation, however, is strong evidence against causation - If you are reading about an epidemiological (observational) study, it proves nothing. Epidemiology is the study of factors affecting the health and illness of populations. Not long ago researchers tracked thousands of people over 10 years based on what they reported eating, then said that eating more red meat was "linked to early death." That's fine as far as it goes, but then the reporting (and the "scientists") took the next step and implied that eating meat causes early death. That's not science.

In a similar vein, beware the expression "...is a risk factor for...", e.g. "Obesity is a risk factor for diabetes." When I read that, I interpret causality, as in obesity causes diabetes. The truth of the matter is more like, "people who have diabetes tend also to be obese; both conditions are related to an insulin-resistant metabolic defect."

Epidemiological studies can point you in the direction of something to test further, i.e. they can generate a hypothesis, but they cannot establish causality. Here's a great comment by "seyont" from Dr. Eades' blog: "An observational study is sort of a triple-blind study. The researchers do not know what experiment was performed, on whom it was performed, or even if it was performed. They grab a bunch of people and dream up an experiment which could plausibly have produced the correlations they see."
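
seyont's "triple-blind" jab can be illustrated with a toy confounder simulation (all variables invented): one hidden trait drives both observed quantities, producing a strong correlation with zero causation between them.

```python
import random

random.seed(1)

def pearson(xs, ys):
    # Plain Pearson correlation coefficient, no libraries needed.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hidden confounder: "health consciousness" drives both lower
# red-meat intake and longer life. Meat never touches lifespan.
health = [random.gauss(0, 1) for _ in range(2000)]
meat = [-h + random.gauss(0, 0.5) for h in health]
lifespan = [h + random.gauss(0, 0.5) for h in health]

r = pearson(meat, lifespan)
print("correlation(meat, lifespan) =", round(r, 2))
# Strongly negative by construction -- yet an intervention on meat
# intake here would change lifespan by exactly nothing.
```

An observational study sees only `meat` and `lifespan`; the confounder that actually generated the data is invisible.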

3. Mouse studies are similar. They can point you in a direction for further study, but are not conclusive. I'm sure you get that mice are not people (although some people are rats). Here's an example and an alternate hypothesis.

4. Look out for the word "adjusted." There are always confounding factors that researchers want to adjust for. It is impossible to adjust with perfect accuracy, so there is always slop in it. If an adjustment factor is applied, there is doubt in the conclusions. This is especially an issue when the size of the adjustments (the noise) is of a similar order of magnitude to the quantity being measured (the signal). It doesn't necessarily mean the conclusions are wrong, but the basis of the adjustments has to be really strong for the conclusions to be valid.
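
The signal-versus-noise worry can be made concrete with invented numbers: when the adjustment is the same order of magnitude as the effect left over after adjusting, a modest error in the adjustment changes the conclusion entirely.

```python
# Made-up numbers for illustration only.
raw_difference = 10.0                     # raw difference between groups
adjustment_estimates = [6.0, 8.0, 10.0]   # plausible confounder estimates

adjusted_effects = [raw_difference - a for a in adjustment_estimates]
for a, eff in zip(adjustment_estimates, adjusted_effects):
    print(f"adjustment {a:>4}: adjusted effect = {eff}")

# A 25% wobble in an adjustment of ~8 swings the "finding" from 4.0
# down to 0.0 -- the conclusion hinges on the adjustment's accuracy.
```
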
5. Watch out for weasel words like "probably," "potentially," and "may." It doesn't mean that the conclusions are necessarily wrong, but often those words are inserted to make the sensational headlines true. Brad Pilon of Eat Stop Eat fame talks about an article with plenty of weasel words here.

6. Details are important and usually not covered in a 1,000 word article.

7. Randomized double-blind placebo controlled intervention studies are the gold-standard for nutrition research. Unfortunately they can be extremely expensive, especially for long term effects.

8. Follow the money - Look to see who paid for the research. If the results fit their commercial or social agenda, doubt the conclusions. There is a very strong bias in research publications: results that do not support the sponsor's agenda either do not get published or get spun. This article hammers the statin people for treating their agenda as science.

9. Focus on what you care about. Many studies support a very narrow agenda. For example, the American dermatologists association says to always protect your skin from the sun with sunscreen. Their agenda is to prevent skin cancer. However, lack of sun can cause vitamin D deficiency, which is associated with (causality?) a higher incidence of other cancers, flu, and colds.

10. Question results that are not consistent with how mankind has lived for two million years. Using sun exposure as an example again, our bodies are finely tuned to an ancient lifestyle. It seems crazy in that framework to think that exposure to "normal" amounts of sun would be overall detrimental to health. Your mileage may vary if you are very fair for example, but keep things in perspective.

11. Beware statistical significance. Statistical significance tells you how sure you can be that a difference or relationship exists at all. If a relationship between two variables is found to be "statistically significant," it means it is unlikely (typically less than a 5% probability) that the observed relationship arose by chance. The relationship itself may still be tiny or practically meaningless; significance says nothing about its size or importance.
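
One concrete way to see the size-versus-significance distinction (a sketch with assumed numbers, using a plain z-test for a difference of means): a trivially small difference becomes "statistically significant" once the sample is large enough.

```python
import math

def two_sided_p(diff, sd, n_per_group):
    # z-test for a difference of two means with equal group sizes.
    se = sd * math.sqrt(2.0 / n_per_group)
    z = abs(diff) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

diff, sd = 0.05, 1.0   # a tiny, practically meaningless difference

for n in (100, 10000, 1000000):
    p = two_sided_p(diff, sd, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per group = {n:>7}: p = {p:.4f} ({verdict})")
```

The effect never changes; only the sample size does. Significance tells you the difference is probably real, not that it matters.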

12. Paradoxes aren't paradoxes. They are information that disproves the hypothesis behind some preconceived notion or prejudice. Perhaps the most famous is the "French Paradox": the French eat more saturated fat, yet have better heart health than Americans. Lots of hypotheses have been generated to explain this (more smoking, more red wine, etc.). Yeah, that's it, that's the ticket. Drinking and smoking actually protect you from heart disease. Oh, but not too much. That will kill you. Dave at Spark of Reason wrote this article on paradoxes. Tom Naughton of Fat Head fame had this to say.

Let's apply Occam's Razor. There is a much simpler conclusion--the original hypothesis is wrong: saturated fat does not cause heart disease. From Dave at Spark of Reason: "A true paradox would indicate inconsistency in the rules and assumptions used to build the system."

13. Many intervention studies do not test the reported factor, but rather something entirely different. As an example, if you want to test the impact of a low carbohydrate diet on lipid profiles, there should be control of the actual ingestion of carbohydrates. However, some studies are actually designed to test compliance to that type of diet as well as the impact. It then becomes difficult to draw conclusions about the physical effect of the diet.

14. Statements of certainty, e.g. "the science is settled" or "it is universally accepted that...", can be used to bully people into alignment. It takes a gutsy and very confident person to argue against the "consensus." This doesn't mean that the predominant view is wrong; it just raises a red flag that people may be trying to cut off additional inquiry.

15. It is possible that most published research is wrong. Bayesian analysis of false positives and negatives, combined with human nature and biases, conspires to place a scary-high likelihood that any given research result is incorrect. Here's the original paper for anyone interested.
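
The core of that argument is simple Bayes arithmetic. With assumed but plausible numbers (a 10% prior that a tested hypothesis is true, 80% power, a 5% false-positive rate), a "positive" finding is true only about two-thirds of the time -- before bias makes it worse:

```python
# Assumed inputs -- illustrative, not from any specific field.
prior = 0.10   # P(hypothesis is true) before the study
power = 0.80   # P(positive result | hypothesis true)
alpha = 0.05   # P(positive result | hypothesis false)

true_pos = prior * power           # 0.08
false_pos = (1 - prior) * alpha    # 0.045

# Probability a published positive finding is actually true:
ppv = true_pos / (true_pos + false_pos)
print(f"P(true | positive) = {ppv:.2f}")  # 0.64
```

Lower the prior (long-shot hypotheses) or add publication bias, and the majority of positive findings can indeed be false.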

16. Look at the logic. If the author is simply attacking someone's qualifications (or lack thereof) rather than the argument, it may indicate that the substance is weak.

Some Examples of Analysis of Science Writing

1. This article, "A High Fat Diet During Pregnancy Can Lead To Severe Liver Disease in Offspring," skewered here by Chris Masterjohn, is an excellent example of several of the points above. The article talks about "the mother," "the woman," or "the child" in several places; the headline talks about high fat, while the researchers talk about high saturated fat. Incredibly, they fail to mention that the "woman and mother" in question is a mouse. Oh, and by the way, that low-fat control diet was higher in saturated fat as a percentage than the high-fat diet.

From Masterjohn:
"...less than seven percent of the calories from the 'unhealthy saturated-fat-enriched diet' actually came from saturated fat.

"The "unhealthy saturated fat-enriched diet" actually contained 44 percent of its fat as polyunsaturated fatty acids (PUFA) and almost twenty percent of its total calories as PUFA. This is in great excess of the PUFA consumption seen even in the Standard American Diet (SAD), loaded in processed PUFA-rich vegetable oils."

2. Design of experiments, hypothesis testing, and Bayesian inference are critical to understanding the validity of ideas. It seems that many researchers are not skilled in applying them, especially in nutrition and exercise. We're left, then, with unskilled reporters and broscience.

One of the more famous failures occurred some years ago, when an epidemiological study demonstrated a correlation between hormone replacement therapy (HRT) in women and better health outcomes. For years after that, millions of women were put on HRT post-menopause. Only recently did researchers actually perform a placebo-controlled study and find that (oops) HRT actually increases breast cancer and heart attacks and can lead to early dementia. It can also have some positive effects, such as a decrease in hip fractures and colorectal cancer. It seems that in the early epidemiological studies, the women on HRT tended to be more health conscious in general, which masked the negative impact of the HRT.

But there are important details here. Even in the good study, the timing of starting HRT made a difference, and the impacts varied with age. Also, it could be an indictment of the specific hormones used in the study and not of HRT in general.

3. Another significant epidemiological study that is looking worse every day is Ancel Keys' Seven Countries Study. This may be one of the most devastatingly influential studies in history. In a nutshell, Keys plotted cardiovascular disease against animal fat consumption in seven countries and found a very high correlation. However, he omitted 14 countries with data that did not support his hypothesis. This in turn led directly to reduced fat consumption and higher sugar and carbohydrate intake. Some believe that this is one of the main causes of the obesity and cardiovascular disease epidemic in the world today.

4. Read Gary Taubes' article on epidemiology or, better yet, buy Good Calories, Bad Calories.

5. Here's another Taubes interview on Good and Bad Science

6. Michael Eades on epidemiological research
7. Staggeringly good 4-part example (1), (2), (3), (4) of analyzing a study and generating an alternate hypothesis from Whole Health Source
8. Interesting perspective on good science from OvercomingBias

A quote from one of the commenters, daublin:

"There is a lot of confusion in both the cited article and in this article about what science is. For sure, it’s not the “scientific method”. That doesn’t mean science doesn’t happen, but that it’s being misdescribed. Here’s what science means, in the tradition descending from the Enlightenment. Science is study with the following properties:

"1. It’s about objective claims. There is no place in science for claims that different observers will, by definition, never agree on.

"2. It’s about falsifiable claims. There is no place in science for claims that could never possibly be decisively proven false.

"3. The evidence must be repeatable by other scientists. In particular, experiments must be communicated in enough detail that other scientists can repeat the experiment so as to verify the result.

"4. It follows Occam’s Razor. Simple theories are better than complex ones."


  1. I'd like more, please.

    My view is that the woes of "science" can be traced to the idea that there is any sort of objective truth. The best we can ever do is establish "truth beyond reasonable doubt", so to speak. For instance, our experience is that objects fall toward the Earth's center. We can't prove that this will always happen, but we can draw on many "experiments" to bolster our expectations, i.e. it is highly unlikely that we will ever see an object "fall up".

    The mind, unfortunately, gravitates toward absolutes. We "choose" (consciously or not) to judge hypotheses as absolutely true or false, and this combined with a sort of "herd mentality" leads to information cascades, memes, etc. Whenever someone says "we know that hypothesis X is true", what they're really saying is that they chose to believe hypothesis X absolutely.

    By contrast, Bayesian Inference allows no such choices. Belief in a hypothesis is simply a number generated by the available information. Two people with the same information will arrive at the same level of belief. As new information arrives, those beliefs are updated. The attributes of falsifiability, Occam's razor, etc. are all implicit with this approach. Weak evidence (e.g. epidemiological) can only move beliefs slightly; strong evidence (e.g. double-blind placebo controlled studies) have larger effect. In no case, however, can we ever arrive at absolute truth/falsehood, since this implies that we know absolutely all information which could affect our belief in a hypothesis.

  2. Thanks Dave. This looks like a vote for more on Bayesian analysis. That should be simple...

    I am planning to dig more into correlation and causation as well.


  3. Richard at Free the Animal skewers some bad science writing.