Correlation vs. Causation
Did you know that data from the US shows the number of ice cream sales per month is strongly correlated with the number of shark attacks per month?
Also, did you know that the number of movies starring Margot Robbie correlates with the number of firefighters in South Dakota?
These are two examples of how easily people confuse correlation with causation. It is important to understand both terms and how they can help us make better decisions.
Correlation is a statistical measure that tells us how strongly a pair of variables move together. It says nothing about why they do.
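To make "how strongly" concrete, here is a minimal sketch of the most common measure, the Pearson correlation coefficient, which runs from -1 (perfectly opposite) through 0 (no linear relationship) to +1 (perfectly in step). The function name `pearson_r` and the sample numbers are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables vary together, and how each varies on its own.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return covariation / math.sqrt(spread_x * spread_y)


# Made-up numbers: y rises in lockstep with x, so r is (very close to) +1.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
# Made-up numbers: y falls as x rises, so r is (very close to) -1.
print(pearson_r([1, 2, 3], [3, 2, 1]))
```

Note that the calculation only looks at the two columns of numbers. Nothing in it knows, or can know, whether one variable is driving the other.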
Causation goes a step further than correlation: it asks whether changing one of the variables will cause a change in the other. It is also known as 'cause and effect'.
Ice cream sales are up and shark attacks are up: two variables that correlate. Are ice cream sales causing more shark attacks? No. Correlation does not mean causation. Nor is this correlation just a coincidence. More likely, a hidden third factor is driving both, and in this example that factor is probably the weather. When the temperature goes up, more people buy ice cream. And when the weather is more pleasant, more people are likely to go for a swim in the sea and be subjected to unwanted advances by a shark. (The Margot Robbie and firefighter pairing, by contrast, is most likely pure coincidence.)
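The weather explanation can be sketched as a small simulation. Everything here is invented for illustration: the temperature range, the coefficients, the noise levels, and the "shark index" are assumptions, not real data. The point is only that ice cream and sharks never appear in each other's formula, yet they still end up strongly correlated because both depend on temperature.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def simulate(n_periods=2000, seed=0):
    """Generate made-up temperature, ice cream sales and shark index values."""
    rng = random.Random(seed)
    temps, ice_cream, sharks = [], [], []
    for _ in range(n_periods):
        temp = rng.uniform(0, 35)  # hypothetical temperature in degrees C
        temps.append(temp)
        # Each outcome depends on temperature plus its own random noise.
        # Neither outcome is used to compute the other.
        ice_cream.append(10 * temp + rng.gauss(0, 30))
        sharks.append(0.2 * temp + rng.gauss(0, 1))
    return temps, ice_cream, sharks


temps, ice_cream, sharks = simulate()
print("ice cream vs sharks:     ", round(pearson_r(ice_cream, sharks), 2))
print("temperature vs ice cream:", round(pearson_r(temps, ice_cream), 2))
print("temperature vs sharks:   ", round(pearson_r(temps, sharks), 2))
```

With these made-up settings all three correlations come out strongly positive, even though by construction ice cream sales have no effect on the shark index.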
How do we mitigate this?
1. Think critically about the information you are analyzing and the link between the variables you are reviewing.
2. Seek to qualify the data where possible.
3. Ask yourself if there is another variable that could be influencing both of the variables in the study.
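One way to act on the third point, if you suspect a specific third variable and have data on it, is a partial correlation: the correlation left between two variables once the suspected third variable is accounted for. The sketch below uses the standard first-order formula. The function name `partial_corr` and the three input correlations are made-up illustrations, not measured values.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for a third variable z.

    Takes the three ordinary pairwise correlations as inputs.
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical inputs: x = ice cream sales, y = shark attacks, z = temperature.
raw = 0.86
controlled = partial_corr(r_xy=raw, r_xz=0.96, r_yz=0.90)
print("raw correlation:            ", raw)
print("controlling for temperature:", round(controlled, 2))
```

If the controlled value drops to roughly zero, as it does with these illustrative inputs, the third variable accounts for most of the original relationship. This only helps with variables you thought to measure, so it supports critical thinking rather than replacing it.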
In decision science, it is important to rely on data to make more informed decisions. The drawback is that we can easily be tricked into mistaking correlation for causation.
If you are interested in more spurious correlations, check out tylervigen.com