Saturday, January 23, 2016

Trending Data

I have known about these trending search engines for a while and thought they were quaint, but recently I have seen some uses that make me believe they may be worth more, and worth talking about in a senior Data Management class. For example, I saw this one from @NateSilver538:
Another example is from the Science Friday Podcast talking about tracking "hate" through Google searches. Listen below:
The trending site used in both of those cases was Google Trends, which has been around for a while. Basically, you put in the search terms you wish to compare and it shows how often they were searched on Google. For example, the Super Bowl is coming up in a couple of weeks, so if you search "Superbowl", it shouldn't be surprising that we get a periodic pattern:

Once you have one search term, you can add others. For example, let's see how popular Christmas is compared to the Super Bowl:

Another place to look for trending terms is Twitter. And the site gives analytics. Here you enter a hashtag and get the last 24 hours of Twitter traffic for that hashtag (at least in the free version). You can't do a comparison of hashtags but you can search any hashtag you wish. However you could highlight

Another place you can get trend data is Quantcast. This site does analytics on website traffic in general.
You can get detailed analytics for free for any of the sites that are listed as directly measured.

The Analysis

Most of the trending sites leave little analysis to do, but we often hear about topics "trending", so these sites can bring something concrete to class. Some simple analysis can be done with the Quantcast data, though: import the table of sites and students can work on histograms and bar graphs.
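If you would rather do this in code than in a spreadsheet, here is a minimal sketch of the idea in Python. It assumes the table has been exported as CSV with columns named "site" and "visitors"; those column names, the site names, and every number below are placeholders I made up for illustration, not the actual Quantcast data.

```python
import csv
import io

# Placeholder rows standing in for an exported table. The column names
# ("site", "visitors") and all of the numbers are made up for illustration.
SAMPLE_CSV = """site,visitors
site-a.example,900
site-b.example,450
site-c.example,300
site-d.example,280
site-e.example,120
site-f.example,110
"""

def load_traffic(csv_text):
    """Return a list of (site, visitors) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["site"], int(row["visitors"])) for row in reader]

def histogram(values, bin_width):
    """Count how many values fall in each bin [k*width, (k+1)*width)."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

def text_bars(pairs, scale):
    """One line per site: a bar of '#' characters, one per `scale` visitors."""
    return [f"{site:<16} {'#' * (visitors // scale)}" for site, visitors in pairs]

traffic = load_traffic(SAMPLE_CSV)
bins = histogram([v for _, v in traffic], bin_width=200)
bars = text_bars(traffic, scale=50)

for line in bars:
    print(line)
print(bins)
```

The bar graph shows one bar per site, while the histogram counts how many sites land in each traffic range, which is a nice distinction to draw out in class.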

Sample Questions 

  • Find a trending topic on Twitter or Google. Verify the data using one of the trending analytic sites. Compare to a similar topic.
  • How does the traffic of the top 10 most popular sites compare to the next 10?
  • Are there any outliers in the set of most popular sites?
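For the last two questions, a rough sketch in Python might look like the following. The traffic values are placeholders, not real measurements, and the 1.5 × IQR rule used here is just one common convention for flagging outliers; your class may use a different one.

```python
import statistics

def compare_groups(values):
    """Mean of the top 10 values vs. mean of the next 10, ranked descending."""
    ranked = sorted(values, reverse=True)
    top, nxt = ranked[:10], ranked[10:20]
    return statistics.mean(top), statistics.mean(nxt)

def iqr_outliers(values):
    """Values lying more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Placeholder traffic values for 20 hypothetical sites (not real data):
# one very large site followed by 19 evenly spaced smaller ones.
visitors = [1000] + list(range(190, 0, -10))

top_mean, next_mean = compare_groups(visitors)
outliers = iqr_outliers(visitors)

print("Top 10 mean:", top_mean)
print("Next 10 mean:", next_mean)
print("Outliers:", outliers)
```

With real data, students can then discuss whether a flagged site is a genuine outlier or just the top of a very skewed distribution.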

Download the Data

Quantcast data (Sheets, Sheets with graphs, Fathom, Fathom with Graphs, CODAP)

Let me know if you used this data set or if you have suggestions of what to do with it beyond this.

