My favorite example of why correlation does not prove causation: firemen are present at every fire, but it would be wrong to conclude that firemen cause fires.
Correlation is expressed as a coefficient, a number between -1 and 1.
1 is a perfect positive correlation. An increase in one variable comes with a proportionate increase in the other.
-1 is a perfect negative correlation. An increase in one variable comes with a proportionate decrease in the other.
0 is no correlation. A change in one variable has no linear relation to changes in the other.
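To make the three cases concrete, here is a minimal sketch that computes the usual (Pearson) correlation coefficient by hand. The function name `pearson` and the small data sets are made up for illustration; they are not taken from any study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only
xs = [1, 2, 3, 4, 5]
up = [2 * x + 1 for x in xs]    # rises in proportion to x
down = [-3 * x for x in xs]     # falls in proportion to x
flat = [1, -1, 0, -1, 1]        # no upward or downward trend with x

print(pearson(xs, up))    # perfect positive correlation
print(pearson(xs, down))  # perfect negative correlation
print(pearson(xs, flat))  # no correlation
```

Up to floating-point rounding, the three printed values should be 1, -1, and 0.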
Tom Naughton, who made the movie Fat Head, writes a blog. A recent post discussed a large study called the NIH-AARP Diet and Health Study, which observed a number of correlations of varying strength between diet and health outcomes. He discusses the significance of the study and its correlations, and there are some really good comments after the post.
He talks about some of the correlations found in the study, and I realized that I didn't have a good sense of what correlations of those sizes actually look like, so I generated some.
Here are the correlations when both variables follow uniform distributions. Note that even a correlation of 0.6 leaves a lot of variability.
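One way to generate points like these is sketched below. It is an assumption on my part, not necessarily the method used for the plots here. It draws a pair of correlated normal values, then maps each one through the normal CDF so that both variables end up uniform on 0 to 1 (a Gaussian copula). The adjustment `2 * sin(pi * r / 6)` picks the normal-scale correlation that yields the target correlation `r` after the mapping. The names `correlated_uniforms` and `sample_r` are illustrative.

```python
import math
import random

def sample_r(points):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

def normal_cdf(z):
    """Standard normal CDF, which maps a normal value to a uniform one."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def correlated_uniforms(r, n, seed=0):
    """Return n (u1, u2) pairs, each uniform on [0, 1], with correlation near r."""
    rng = random.Random(seed)
    # Correlation needed on the normal scale to get r on the uniform scale
    rho = 2.0 * math.sin(math.pi * r / 6.0)
    # Guard against tiny negative values from rounding when |r| is 1
    noise_scale = math.sqrt(max(0.0, 1.0 - rho * rho))
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + noise_scale * rng.gauss(0.0, 1.0)
        points.append((normal_cdf(z1), normal_cdf(z2)))
    return points

for target in (0.0, 0.3, 0.6, 0.9):
    pts = correlated_uniforms(target, 5000)
    print(target, round(sample_r(pts), 3))
```

The sample correlations should land near each target but not exactly on it, since every run is a finite random sample.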
Here are the correlations when both variables follow normal distributions. These points cluster more toward the center because normal distributions put most of their weight near the mean.
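For the normal case, a common construction is to mix one standard normal value with independent noise. The sketch below assumes that approach; it is not necessarily how the plots here were made, and the names `correlated_normals` and `sample_r` are illustrative. With a target correlation `r`, the second variable is `r * x` plus noise scaled by `sqrt(1 - r * r)`, which keeps its variance at 1.

```python
import math
import random

def sample_r(points):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

def correlated_normals(r, n, seed=0):
    """Return n (x, y) pairs of standard normals with correlation near r."""
    rng = random.Random(seed)
    # Share of y's spread that comes from noise unrelated to x
    noise_scale = math.sqrt(1.0 - r * r)
    points = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = r * x + noise_scale * rng.gauss(0.0, 1.0)
        points.append((x, y))
    return points

r = 0.6
pts = correlated_normals(r, 5000)
print("sample correlation:", round(sample_r(pts), 3))
print("noise scale:", round(math.sqrt(1.0 - r * r), 3))
```

The noise scale is the reason a 0.6 correlation still looks so scattered: at r = 0.6 the unrelated noise has a standard deviation of about 0.8, larger than the 0.6 weight on the shared part.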
A fun allegory on correlation and causation
Correlation is not causation