Christmas causes colon cancer? No, it's just correlated.

Christmas Causes Colon Cancer? Google Correlate

A few days ago, Google released a new experimental service called Google Correlate. It is similar to Google Trends in that it analyses the number of times search terms are used on Google. The difference, as the name suggests, is that it lets you find correlations between different search terms.

In simple terms, a correlation shows how often a high level of one thing is found together with a high level of another. Correlation doesn’t necessarily mean that one thing causes the other, just that the two things are seen together. For example, the number of wrinkles on someone’s forehead is correlated with the number of heart attacks they have had. Wrinkles don’t cause heart attacks, or vice versa, but there is a correlation between them: one increases as the other increases.
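To make the idea concrete, here is a minimal Python sketch of the Pearson correlation coefficient, the usual "r" value that ranges from -1 to 1. The wrinkle and heart attack numbers are invented purely for illustration (they are not real data), and the function name `pearson_r` is my own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two series move together around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each series moves on its own
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers for six imaginary people (illustration only)
wrinkles = [2, 4, 5, 7, 9, 12]
heart_attacks = [0, 0, 1, 1, 2, 3]

r = pearson_r(wrinkles, heart_attacks)
print(f"r = {r:.3f}")  # close to 1: the two rise together
```

A value near 1 only says the two numbers rise together. Nothing in the calculation knows or cares whether one causes the other.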

When you search on Google Correlate, it returns the search terms that are most highly correlated with the terms you entered. All too often, that means misspellings of those terms come out on top, such as ‘obama’ being correlated with ‘0bama’ (Obama misspelt with a zero). Unfortunately, Google Correlate also launched without the feature that I think would be the most fun: finding the correlation between any two terms. I have made a little page that will force Google Correlate to do this for you.

Click here to use the full features of Google Correlate using my page.

Using this method I came up with some interesting observations. My first search was to look at Easter and Christmas:

Christmas and Easter

Time plot of Christmas and Easter.

You can see that as each of these holidays approaches, the number of searches increases. However, they aren’t correlated (r=-0.1292). In fact, if you look below you’ll notice that when people are searching for one they are pretty much not searching for the other!

Christmas and Easter Scatter Plot
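Why would two holidays that both spike every year come out weakly negative rather than positive? A rough Python sketch with synthetic data shows the effect. Everything here is an assumption for illustration: the bell-shaped spikes, the peak weeks (14 for Easter, 51 for Christmas) and the spike width are invented, not Google's data, so the result will not match the r value above exactly.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def seasonal_spike(week, peak_week, width=2.0):
    """Toy search interest: a bell-shaped bump around peak_week, repeating every 52 weeks."""
    distance = abs((week % 52) - peak_week)
    distance = min(distance, 52 - distance)  # wrap around the year end
    return math.exp(-(distance ** 2) / (2 * width ** 2))

weeks = range(52 * 3)  # three synthetic years of weekly data
christmas = [seasonal_spike(w, peak_week=51) for w in weeks]
easter = [seasonal_spike(w, peak_week=14) for w in weeks]

r = pearson_r(christmas, easter)
print(f"r = {r:.3f}")  # weakly negative
```

Whenever one series is at its peak, the other is sitting below its own average, so the two never rise together. Most weeks both are near zero, which keeps the correlation weak rather than strongly negative.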

I thought that Easter and chocolate would be more strongly correlated with each other. To my surprise, that’s not the case. In fact, there’s a bigger correlation between Christmas and chocolate (see below).

Easter and Chocolate Scatter Plot

Chocolate and Christmas Scatter Plot

Another feature of Google Correlate is that it lets you map where in the US terms are highly correlated. It turns out that in states where people are searching for Christmas, they are also searching for signs of colon cancer:

Christmas causes colon cancer? No, it's just correlated.

Does this mean that Christmas causes colon cancer? No, Scrooge, it doesn’t. It simply means that the states that search for one also search for the other. Maybe it has to do with the average socio-economic background of those states, or religious beliefs (or lack thereof) that lead a person to worry about what might be growing inside their belly.

The last couple of features I would like to introduce to you are the data upload and draw features. If you have your own raw data, you can upload it to see how it correlates. As most of us don’t have data to upload, it is much more fun to use the draw tool to find correlations. One that I drew randomly was a wave that gets high around 2006, drops down, then goes up again to 2009, then drops again. It turns out that the most highly correlated term is “2006 mercedes benz”. This is also true of “2006 audi” and “2006 jaguar”. It seems that people are more interested in buying luxury cars about 3 years after they are released, to give them time to depreciate a little.

This drawing results in "2006 Mercedes Benz" as a highly correlated search
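Conceptually, the draw feature compares your curve against the history of many search terms and returns the best matches. The Python sketch below does this by brute force on a handful of toy series. It is only an illustration: the candidate names and all the numbers are hypothetical, and I am not claiming this is how Google Correlate actually performs the search.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_by_correlation(target, candidates):
    """Return (r, name) pairs sorted from most to least correlated with target."""
    scored = [(pearson_r(target, series), name) for name, series in candidates.items()]
    return sorted(scored, reverse=True)

# A hand-drawn curve, one value per year from 2004 to 2011:
# high around 2006, a dip, high again in 2009, then a drop.
drawn = [1, 3, 8, 4, 5, 9, 3, 1]

# Hypothetical search-term histories (invented numbers, illustration only)
candidates = {
    "term_a_two_peaks": [2, 4, 9, 5, 6, 10, 4, 2],
    "term_b_steady_rise": [1, 2, 3, 4, 5, 6, 7, 8],
    "term_c_mirror": [9, 7, 2, 6, 5, 1, 7, 9],
}

ranked = rank_by_correlation(drawn, candidates)
for r, name in ranked:
    print(f"{name}: r = {r:.3f}")
```

The series with the same two-peak shape comes out on top even though its values are all shifted up by one, because correlation compares shapes rather than absolute levels.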
