## Saturday, November 21, 2015

### Anscombe's Quartet

Anscombe's Quartet is a group of four two-variable data sets that share a particularly interesting property.

On inspection, the first three sets share the same x values, but beyond that the y values all look random. Things get interesting once you start doing some numerical analysis on them. Start with some simple single-variable calculations:
• Mean of each x set = 9
• Mean of each y set = 7.50
• Variance of each x set = 11
• Variance of each y set = 4.122-4.128
So the four sets are almost identical on these measures. Take it a step further and do the two-variable analysis on each set, and you get the following:
• Correlation of each set = 0.816
• Line of best fit for each set: y = 3 + 0.5x
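If you want to check these numbers yourself, here is a minimal Python sketch. The data values are not listed in this post; they are typed in from Anscombe's 1973 paper as it is commonly reproduced. The names `quartet`, `summarize`, and `summaries` are made up for this example, and the variance is assumed to be the sample variance (dividing by n - 1), which is what gives 11 for the x values.

```python
from math import sqrt

# Anscombe's quartet (Anscombe, 1973). These values are not given in the
# post itself; they are entered here for illustration.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I, II, III
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the single-variable and two-variable statistics for one set."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross products about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx  # least-squares slope
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variance (assumption: n - 1)
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),  # Pearson correlation
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}

for name, s in summaries.items():
    print(
        f"Set {name}: mean x = {s['mean_x']:.2f}, mean y = {s['mean_y']:.2f}, "
        f"var x = {s['var_x']:.2f}, var y = {s['var_y']:.3f}, "
        f"r = {s['r']:.3f}, fit: y = {s['intercept']:.2f} + {s['slope']:.3f}x"
    )
```

Running it prints one line per set, so students can compare the four rows side by side before they ever see a graph.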
So with all that analysis done, you might get the impression that these are pretty much four versions of the same data set. But when you graph them, you get something entirely different. The first set looks like an ordinary linear scatter, the second follows a smooth curve, the third is a tight line with one outlier, and the fourth stacks every point but one at a single x value.
So you really see that they are very different sets of data. The lesson here is that neither numerical nor graphical analysis alone fully describes your data; you need both.

## Classroom Connections

So how do you use this in class? This data set works best with students who have already done both single-variable and two-variable analysis. It is a great set for tying together many of the concepts of data analysis.

One thing you can do is use this Desmos Activity Builder, which walks students through the analysis. Keep in mind that students should be familiar with calculating mean and variance in a spreadsheet. They should also be comfortable graphing functions and doing linear regression in Desmos.