We all know that correlation is not causation. The direction of causation is not always clear, and sometimes two factors that correlate with each other are actually both caused by a third factor.
My favorite example of correlation not proving causation is that it is not appropriate to say that firemen cause fires because every time there's a fire there are firemen there.
Correlation is expressed as a decimal between -1 and 1.
1 is a perfect positive correlation. An increase in one variable yields a proportionate increase in the other.
-1 is a perfect negative correlation. An increase in one variable yields a proportionate decrease in the other.
0 is no correlation. A change in one variable has no linear relation to changes in the other.
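To make those three landmarks concrete, here is a minimal sketch of the usual Pearson correlation coefficient (covariance divided by the product of the two standard deviations). I'm assuming Pearson is the kind of correlation meant here. The function name `pearson_r` and the toy data are my own, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


xs = [-2, -1, 0, 1, 2]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # y rises in step with x
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # y falls in step with x
r_curve = pearson_r(xs, [x * x for x in xs])      # y depends on x, but not linearly

print(r_up, r_down, r_curve)
```

The last case is the interesting one: y is completely determined by x, yet the coefficient comes out at zero, because this measure only picks up straight-line relationships.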
Tom Naughton, who made the movie Fat Head, writes a blog. A recent post discussed a large study called the NIH-AARP Diet and Health Study. The study observed a number of correlations of varying strength between diet and health outcomes. He discusses the significance of the study and its correlations, and there are some really good comments after the post.
He talks about some of the correlations found in the study, and I realized that I didn't have a really good sense of what these correlations look like, so I generated some.
Here are the correlations assuming underlying uniform distributions for the two variables. Note that even a 0.6 correlation has a lot of variability.
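The post doesn't record how the uniform data were produced, so the following is only one possible way to do it, not necessarily the method behind the plots. It draws a pair of correlated normal values and pushes each through the normal CDF, which leaves both variables uniform on [0, 1]. The adjustment `2 * sin(pi * r / 6)` is the standard conversion that makes the uniform pair end up with Pearson correlation near the target. All names here (`correlated_uniform_pairs`, `normal_cdf`, `sample_correlation`) are hypothetical.

```python
import math
import random


def sample_correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def correlated_uniform_pairs(target_r, n, rng):
    """Return n (u, v) pairs, each uniform on [0, 1], with correlation near target_r."""
    if not -1.0 <= target_r <= 1.0:
        raise ValueError("target_r must be between -1 and 1")
    # Correlation needed on the normal scale to hit target_r on the uniform scale.
    rho = 2.0 * math.sin(math.pi * target_r / 6.0)
    noise_scale = math.sqrt(max(0.0, 1.0 - rho * rho))
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + noise_scale * rng.gauss(0.0, 1.0)
        pairs.append((normal_cdf(x), normal_cdf(y)))
    return pairs


rng = random.Random(1)  # fixed seed so the run is repeatable
uniform_pairs = correlated_uniform_pairs(0.6, 20000, rng)
us, vs = zip(*uniform_pairs)
uniform_r = sample_correlation(us, vs)
print(round(uniform_r, 2))  # should land close to 0.6
```

Plotting `us` against `vs` as a scatter gives a square cloud with a visible diagonal tilt, which is a good way to see how loose a 0.6 correlation really is.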
Here are the correlations assuming underlying normal distributions for the two variables. These points are more centrally grouped because of the nature of the underlying distributions.
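For the normal case, a common recipe (again an assumption on my part, since the post doesn't say how the data were generated) is to draw x from a standard normal and set y = r*x + sqrt(1 - r^2)*z, where z is an independent standard normal. That keeps y at unit variance and gives the pair a correlation of r. The helper names are hypothetical.

```python
import math
import random


def sample_correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def correlated_normal_pairs(target_r, n, rng):
    """Return n (x, y) pairs of standard normals with correlation near target_r."""
    if not -1.0 <= target_r <= 1.0:
        raise ValueError("target_r must be between -1 and 1")
    # Weight on the independent noise term so that y keeps unit variance.
    noise_scale = math.sqrt(1.0 - target_r * target_r)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        pairs.append((x, target_r * x + noise_scale * z))
    return pairs


rng = random.Random(42)  # fixed seed so the run is repeatable
normal_pairs = correlated_normal_pairs(0.6, 20000, rng)
nx, ny = zip(*normal_pairs)
normal_r = sample_correlation(nx, ny)
print(round(normal_r, 2))  # should land close to 0.6
```

Because most normal draws fall near zero, a scatter of these pairs bunches up in the middle and thins out toward the edges, which matches the more centrally grouped look described above.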
A fun allegory on correlation and causation