tag:blogger.com,1999:blog-22710614019596437092024-03-05T23:08:36.074-08:00Found DataFinding data that can be analyzed in math classroomsDavid Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.comBlogger30125tag:blogger.com,1999:blog-2271061401959643709.post-78751244384535092432021-05-17T10:50:00.001-07:002021-05-17T13:20:22.356-07:00Introductory Statistics Data Cards<p>I love this set of data cards created by @DavidButlerUoA (be sure to check out the comments on the post for more info from him):</p><p></p><blockquote class="twitter-tweet"><p dir="ltr" lang="en">I've just designed a new set of data cards for use in stats workshops, especially with Health Science students. I'm very proud of them. <a href="https://t.co/68Nzfo14aM">pic.twitter.com/68Nzfo14aM</a></p>— David Butler (@DavidKButlerUoA) <a href="https://twitter.com/DavidKButlerUoA/status/1393116987979567112?ref_src=twsrc%5Etfw">May 14, 2021</a></blockquote> <script async="" charset="utf-8" src="https://platform.twitter.com/widgets.js"></script><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiX0Vwt44A8W0kQsAIGyrrwgHIi1CZU-1m_svNNSM0F3Gi0cuxvgkYpGwhWi2y9bi0n0BMsYdXBL62mLcE_SWAHbrFYCtSgS5wzIa0Kfng52xpyrFjUhqhn497hSZx5KFASL_XTPr1ZnKkZ/s1567/DataCards.jpeg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-="" data-original-width="1567" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiX0Vwt44A8W0kQsAIGyrrwgHIi1CZU-1m_svNNSM0F3Gi0cuxvgkYpGwhWi2y9bi0n0BMsYdXBL62mLcE_SWAHbrFYCtSgS5wzIa0Kfng52xpyrFjUhqhn497hSZx5KFASL_XTPr1ZnKkZ/s320/DataCards.jpeg" width="560" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYE0wU4_Wmvn1bwjyeSAT7s8McUQ7rUiauX45ewXBPb6BaTVRi60kTmcHxxOTSJRFRC5e7sNFRzXbuPlG90-5215gDFiWekFx76x5LIruSrFeXNo2DL3quczFbKiO6Hf_s5sgKKG0BW5wQ/s2048/cards.jpeg" 
style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="1546" data-original-width="2048" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYE0wU4_Wmvn1bwjyeSAT7s8McUQ7rUiauX45ewXBPb6BaTVRi60kTmcHxxOTSJRFRC5e7sNFRzXbuPlG90-5215gDFiWekFx76x5LIruSrFeXNo2DL3quczFbKiO6Hf_s5sgKKG0BW5wQ/w200-h151/cards.jpeg" width="200" /></a></div>These are ideal for when you are just starting out talking about stats. Each card is a data point with ten attributes (name, age, height, heart rate, temp, mood, arms, headgear, pet, bike). To me, you give these cards out to students with the instruction to sort them in any way they see fit and then see what happens. I wouldn't even tell them which attributes you have and just let them come to their own discoveries. This is a really great way for students to ease into the idea of analyzing statistics in a painless and approachable way. You can see some of the results that @DavidButlerUoA got <a href="https://twitter.com/DavidKButlerUoA/status/1394093690985926656" target="_blank">here</a>, <a href="https://twitter.com/DavidKButlerUoA/status/1394128685918081026" target="_blank">here</a> and <a href="https://twitter.com/DavidKButlerUoA/status/1394168847586922501" target="_blank">here</a><br /><h3 style="text-align: left;">Analysis</h3><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrkwb2DRnEySv0BU6P1cS8dFC2Fw-KMsktcB6snOvyy5QGcI-Dmq8_NV9ifSNl3_zy6mLP2zEyfeXHzC4YkdtOWOmUA4mcSr0YforRg7Tncqq8zdnMla-3s7qeLOv6At1Kgfc6IaZuz3mH/s300/StickFigureDataCards-temp.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="295" data-original-width="300" height="197" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrkwb2DRnEySv0BU6P1cS8dFC2Fw-KMsktcB6snOvyy5QGcI-Dmq8_NV9ifSNl3_zy6mLP2zEyfeXHzC4YkdtOWOmUA4mcSr0YforRg7Tncqq8zdnMla-3s7qeLOv6At1Kgfc6IaZuz3mH/w200-h197/StickFigureDataCards-temp.png" width="200" /></a></div>Once students have interacted with these cards informally, you can continue to refer to them as you talk about the difference between categorical and numeric data, do some single variable stats measurements, two variable correlation and more. All the while you can keep referring to the cards in a more human context, as each of them represents one "person" (though the data is made up, some of the relationships were taken from health studies). So although you will not solve any statistical mysteries with this data set, it is quite rich and diverse and can be used to demonstrate many different statistical concepts. <p></p><h3 style="text-align: left;">Sample Questions</h3><p></p><ul style="text-align: left;"><li>Sort these cards into any arrangement you wish. What patterns do you see? 
Be sure to justify your arrangement(s).</li><li>What is the probability that if a person is happy, they are dancing?</li><li>Could riding a bike make you healthier?</li></ul><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjz7iqa90jnI4ppsCU54uuJ0cd0ioSSr1t4Qvm9grxhetq-oFAhU02yAEg2jdMz1QzrfocLCS70aFS06_cDIiHG8n2toodT5uofRWXHFgvPgr4SC0SQuJg6zJ54O3UOptQq5sLy96lZPfjx/s305/StickFigureDataCards-bike.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="190" data-original-width="305" height="125" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjz7iqa90jnI4ppsCU54uuJ0cd0ioSSr1t4Qvm9grxhetq-oFAhU02yAEg2jdMz1QzrfocLCS70aFS06_cDIiHG8n2toodT5uofRWXHFgvPgr4SC0SQuJg6zJ54O3UOptQq5sLy96lZPfjx/w200-h125/StickFigureDataCards-bike.png" width="200" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUt5tQxmLnWsi75S7Cktpdu8AQ1Rfhf-Em7-cadmyOuuA6N5d-xbP-4m1wx6mewvXfHDu_STMi3Jp0nS_EeAicDfC8L1M_Pay0ENpWz6iPT0enzQU2P-UAzAkSMm3rCryIEKl1o7nK0vR_/s305/StickFigureDataCards-mood.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="190" data-original-width="305" height="125" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUt5tQxmLnWsi75S7Cktpdu8AQ1Rfhf-Em7-cadmyOuuA6N5d-xbP-4m1wx6mewvXfHDu_STMi3Jp0nS_EeAicDfC8L1M_Pay0ENpWz6iPT0enzQU2P-UAzAkSMm3rCryIEKl1o7nK0vR_/w200-h125/StickFigureDataCards-mood.png" width="200" /></a></div></div><h3 style="text-align: left;">Downloads</h3><p></p><p>Original Cards as <a href="https://drive.google.com/file/d/1E93G1L3rzgl15RLGbS2SMJ_656o78i9S/view" target="_blank">PDF</a> (ideally printed on card stock, cut, and laminated)<br />Data (<a href="https://drive.google.com/file/d/1sBntYOUnMXFTqv3EW5zKwIhHxlPFJhfJ/view" target="_blank">CSV</a>, <a 
href="https://docs.google.com/spreadsheets/d/1ppu1BOK2NlYXu6oM1_BbZa2F8nEyFQaj619N1GiT_20/edit?usp=sharing" target="_blank">Google Docs</a>, <a href="https://drive.google.com/file/d/1xE3NuXQLz1nEPHO_-uh9hXY4_RAsHpFm/view?usp=sharing" target="_blank">CODAP</a>)</p><p>Be sure to check out David's other math related teaching materials on his <a href="https://blogs.adelaide.edu.au/maths-learning/" target="_blank">Making Your Own Sense</a> blog </p><p>Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</p><p> </p>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-9520226840960382172021-05-16T10:44:00.006-07:002021-05-16T10:50:07.285-07:00Star Wars Data via Kaggle <p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZ_wyrZCTv0IHPmrjMocXX7ZyNaTn8yAwrCt8O25QfXf70RoplAxpAdCDGbW32vBXYA7pDW-sTqMg59C4d3zlRh1qw8FiOzfHJwRkO90Op54aqN1MBe4jPloDipX9DhSngiEPvKetWQB8M/s240/kaggle.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="86" data-original-width="240" height="70" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZ_wyrZCTv0IHPmrjMocXX7ZyNaTn8yAwrCt8O25QfXf70RoplAxpAdCDGbW32vBXYA7pDW-sTqMg59C4d3zlRh1qw8FiOzfHJwRkO90Op54aqN1MBe4jPloDipX9DhSngiEPvKetWQB8M/w195-h70/kaggle.png" width="195" /></a></div>Another repository of freely available data is called <a href="https://www.kaggle.com/" target="_blank">Kaggle</a>. "Inside Kaggle you’ll find all the code & data you need to do your data science work. Use over 50,000 public datasets and 400,000 public notebooks to conquer any analysis in no time." 
I like this repository because it seems to be easily searchable and there are a lot of data sets, so you should be able to find one on a topic that interests your students without too much trouble. <p></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbE2U0rZsul35ohpBBYEtUkMznnfLNIuI89yHvfPj_Wa29RE6DoAnLqn7qlaShAjM212GFiJ1tN0ivylwONEA3mSY6U5z9zIacxgsZaoposjjKIPe35jrWvSKOamdjfzb-nRL5jCyYRQR2/s339/starwars.jpeg" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="149" data-original-width="339" height="88" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbE2U0rZsul35ohpBBYEtUkMznnfLNIuI89yHvfPj_Wa29RE6DoAnLqn7qlaShAjM212GFiJ1tN0ivylwONEA3mSY6U5z9zIacxgsZaoposjjKIPe35jrWvSKOamdjfzb-nRL5jCyYRQR2/w200-h88/starwars.jpeg" width="200" /></a></div>To showcase a data set, I'm choosing <a href="https://www.kaggle.com/jsphyg/star-wars" target="_blank">one suggested</a> to me by @virgonomic on data from the Star Wars franchise. Actually, it's several data sets. <p></p><h3 style="text-align: left;">Analysis </h3><p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgM6YfKH0w9wHmU7j3S7r5Lz9WONXZLfrFKOatvMX_xaVWbost_zS3ldtmvKOoKZ9rblP997Tl3gQRCyXaRz5Nl8uBWOqjdZgaUUfPQryZuFh6VhyNlQH4gYhpSHG6610MbKHxC9ei4G__Y/s300/Starwarsspecies.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="295" data-original-width="300" height="197" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgM6YfKH0w9wHmU7j3S7r5Lz9WONXZLfrFKOatvMX_xaVWbost_zS3ldtmvKOoKZ9rblP997Tl3gQRCyXaRz5Nl8uBWOqjdZgaUUfPQryZuFh6VhyNlQH4gYhpSHG6610MbKHxC9ei4G__Y/w200-h197/Starwarsspecies.png" width="200" /></a>There are five CSV files, one each on characters, species, planets, starships and vehicles. 
Now you are not going to be doing any groundbreaking statistical work here, as the context of these data sets is pretty niche and aimed at die-hard Star Wars fans. Like, I'm not sure who will care that the Bantha-II cargo skiff has a one day supply of consumables. Nonetheless, these are good data sets to use for basic stats (finding mean, standard deviation, correlation, etc.). You can definitely find many attributes that are categorical as well. One thing I did notice is that in most of the sets there were always one or two things that could be used to talk about outliers, like Jabba the Hutt in the characters data set or the rotational period of planets in the planets data set.</p><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5u9v4dEZzL3BXhV3BCSzTRw7h_4DHFeG3Lc7MhTQfZJjpYbfEXueaS6AMecLxv2d3zzBrJ5aUVVPSZyMExgmqZ32sSJkltMhVQmqZdzDWy7J3qmX8uOcL9ROd_2H1j5H_L6F1HRRQsBRU/s978/Jabba.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="631" data-original-width="978" height="129" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5u9v4dEZzL3BXhV3BCSzTRw7h_4DHFeG3Lc7MhTQfZJjpYbfEXueaS6AMecLxv2d3zzBrJ5aUVVPSZyMExgmqZ32sSJkltMhVQmqZdzDWy7J3qmX8uOcL9ROd_2H1j5H_L6F1HRRQsBRU/w200-h129/Jabba.png" width="200" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheWi9ELhX4VgMqqgiZI3hEA3YX0i_tx_4e4a7Sl5Y9hc8r3sqSpdxbwNEDiu53eSvuvSIUjbK1LynKKyuF5Y0kYEkEMpgWcN_LJuwFpGHLtA8pSOoq_bUx9iCU-9sQvAMf91YLCqlyqxYb/s300/starwarsPlanets.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="240" data-original-width="300" height="160" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheWi9ELhX4VgMqqgiZI3hEA3YX0i_tx_4e4a7Sl5Y9hc8r3sqSpdxbwNEDiu53eSvuvSIUjbK1LynKKyuF5Y0kYEkEMpgWcN_LJuwFpGHLtA8pSOoq_bUx9iCU-9sQvAMf91YLCqlyqxYb/w200-h160/starwarsPlanets.png" width="200" /></a></div><br /></div></div><h3 style="text-align: left;">Sample Questions</h3><div><ul style="text-align: left;"><li><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGCn76HQG7ZsjIWzmRUGEDuYRBjyhZ7AKolPqL18N03lapOUHKku5by9GXNq65KOA-j39t48CUOkFeLAmSsShsPJo_eshZopkkiGAG_Q3ZTkUtzdBQ6i1tTLZEWsig4__NvWU1muKR47QB/s300/Starwarsvehicles2.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="295" data-original-width="300" height="197" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGCn76HQG7ZsjIWzmRUGEDuYRBjyhZ7AKolPqL18N03lapOUHKku5by9GXNq65KOA-j39t48CUOkFeLAmSsShsPJo_eshZopkkiGAG_Q3ZTkUtzdBQ6i1tTLZEWsig4__NvWU1muKR47QB/w200-h197/Starwarsvehicles2.png" width="200" /></a></div>When you consider the length of a vehicle compared to the number of crew it holds, are there any outliers?</li><li>What is the standard deviation of the _______ attribute in the _______ data set?</li><li>Find your favourite character. Pick an attribute and describe how your character compares to the others. </li></ul></div><div><br /></div>BONUS data: Though this is not from this data set, it was recently Star Wars Day and someone posted this infographic comparing the number of lines each character spoke and which words they spoke the most in the original trilogy. <br /><br /><blockquote class="twitter-tweet"><p dir="ltr" lang="en">May the fourth be with you! 
Who has the most lines in the original Star Wars trilogy and what are their 20 top words?<a href="https://twitter.com/hashtag/dataviz?src=hash&ref_src=twsrc%5Etfw">#dataviz</a> <a href="https://twitter.com/hashtag/MayThe4thBeWithYou?src=hash&ref_src=twsrc%5Etfw">#MayThe4thBeWithYou</a> <a href="https://twitter.com/hashtag/MayTheFourthBeWithYou?src=hash&ref_src=twsrc%5Etfw">#MayTheFourthBeWithYou</a> <a href="https://t.co/WarvwX2XOf">pic.twitter.com/WarvwX2XOf</a></p>— Neil Kaye (@neilrkaye) <a href="https://twitter.com/neilrkaye/status/1389607247031021572?ref_src=twsrc%5Etfw">May 4, 2021</a></blockquote><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhs0PpYNwWxweM9DvnECp2a1nUNid7zDNg9fWkDvJRqcHABm2UiQUTSLrUrfYc4RepXbVfCAcYt1pi3ZOjLafZAf2VEHOjJtUFXN1-Gbq_0n0QJA-nwr8JOYi9Zj8LDVZ06RhWf6BxO00Oz/s2611/StarWarsLines.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1205" data-original-width="2611" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhs0PpYNwWxweM9DvnECp2a1nUNid7zDNg9fWkDvJRqcHABm2UiQUTSLrUrfYc4RepXbVfCAcYt1pi3ZOjLafZAf2VEHOjJtUFXN1-Gbq_0n0QJA-nwr8JOYi9Zj8LDVZ06RhWf6BxO00Oz/w400-h184/StarWarsLines.jpg" width="560" /></a></div><p><br /></p><h3 style="text-align: left;">Downloads</h3><p style="text-align: left;"></p><ul style="text-align: left;"><li>Original Data - <a href="https://www.kaggle.com/jsphyg/star-wars" target="_blank">https://www.kaggle.com/jsphyg/star-wars</a></li><li>Entire <a href="https://drive.google.com/drive/folders/10LTggRFKJxk_p_oqGT1xZ0bH6ma7_7W2?usp=sharing" target="_blank">folder</a></li><li>Characters (<a href="https://drive.google.com/file/d/1WkjaSTqDYoq0wW8_iMXQN3bL07G-mefn/view?usp=sharing" target="_blank">CSV</a>, <a href="https://docs.google.com/spreadsheets/d/1NKJGDyUzluQBLK49EmvRSt2PQablVMy4mVTj996VkCI/edit?usp=sharing" target="_blank">Google Sheets</a>, <a 
href="https://drive.google.com/file/d/1L-KCwmZUhW7rreagsgXhyNpU0W-B0o-X/view?usp=sharing" target="_blank">CODAP</a>)</li><li>Species (<a href="https://drive.google.com/file/d/10n_EXOQZrv5Ynz1VsKTLAWagppnhD4fv/view?usp=sharing" target="_blank">CSV</a>, <a href="https://docs.google.com/spreadsheets/d/17BR8CYXR64xBHN4QF0VtXmio8JyZa5Jn2i5DG0kIJ9k/edit?usp=sharing" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/file/d/14_7KTgStKhogaDxV1QRWFqMuaMUmRXIb/view?usp=sharing" target="_blank">CODAP</a>)</li><li>Planets (<a href="https://drive.google.com/file/d/1bO4yJmERe1ketjXPVjmRN_iAFN8oblbr/view?usp=sharing" target="_blank">CSV</a>, <a href="https://docs.google.com/spreadsheets/d/12AwjykPs7FpiUOksI9yy9QuubdeZVocxCIrtxmE1ms4/edit?usp=sharing" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/file/d/14-RqW5y3HPnHCSGPzS1LvYOTQ9yIEclu/view?usp=sharing" target="_blank">CODAP</a>)</li><li>Starships (<a href="https://drive.google.com/file/d/1jX6cE6Z64ZVFQQ0_c7KXdsKEW1txtk-a/view?usp=sharing" target="_blank">CSV</a>, <a href="https://docs.google.com/spreadsheets/d/1GKjDCMspXhNPTP9NIf_oJYbfT9tDE6jHRXU406a-aKw/edit?usp=sharing" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/file/d/1eUQf0eIdh3gZ7U-DbrC40ICpn18xU5_Z/view?usp=sharing" target="_blank">CODAP</a>)</li><li>Vehicles (<a href="https://drive.google.com/file/d/11qrAMmq7WRSTu-HcQUz6lIVw58p-lr4b/view?usp=sharing" target="_blank">CSV</a>, <a href="https://docs.google.com/spreadsheets/d/1TVxDnlcVPcQcl_fF3v0qUGnIuwerFOKObIacnkwM1Dw/edit?usp=sharing" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/file/d/18jo62IAGJy7o-JosX2Cg72dgmzgcogiS/view?usp=sharing" target="_blank">CODAP</a>)</li></ul>Let me know if you used this data set or if you have suggestions of what to do with it beyond this.<p></p>David 
Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-20662110053732650812021-05-15T17:51:00.003-07:002021-05-16T10:45:07.212-07:00The Big Bang Theory Ratings & Viewership via Data.World<div><a href="http://Data.World"></a><div class="separator" style="clear: both; text-align: center;"><a href="http://Data.World"></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO_Xey2mLF-XriMZ8JOFo9ckywlE4Ga6LbpwLZHc2Vk6KcF8WgrhgDgOpNe1nR-wQl-uUus4voQP7b1gYxwHSwwgK2KBOoAx9RsM8-4knotWZWCo3cFLga6KeG8C5t30XcCOOduP_Mmzzn/s422/Screen+Shot+2021-05-15+at+7.15.39+PM.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="116" data-original-width="422" height="55" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO_Xey2mLF-XriMZ8JOFo9ckywlE4Ga6LbpwLZHc2Vk6KcF8WgrhgDgOpNe1nR-wQl-uUus4voQP7b1gYxwHSwwgK2KBOoAx9RsM8-4knotWZWCo3cFLga6KeG8C5t30XcCOOduP_Mmzzn/w200-h55/Screen+Shot+2021-05-15+at+7.15.39+PM.png" width="200" /></a></div>Data.World is a great site for data sets and they all seem to be freely downloadable once you create an account. The site is a paid site but seems to be paid for people who use data in commerce. Members upload all kinds of data sets and you can search through them. 
</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3cHmwKZZOhTpR8dx0u4vtv5K6aj376GAncLtgy6_Urgi5yF6psT7NG7AyZo8rFhSb1VSKsTf4UKpHDJ_t0nH8C_REMwege6y2gJJ_9AVie2DWFsY5XtPZ5XpbY8lKokSVFtdF7b3OYbTG/s1280/bigbangtheory.jpeg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="720" data-original-width="1280" height="113" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3cHmwKZZOhTpR8dx0u4vtv5K6aj376GAncLtgy6_Urgi5yF6psT7NG7AyZo8rFhSb1VSKsTf4UKpHDJ_t0nH8C_REMwege6y2gJJ_9AVie2DWFsY5XtPZ5XpbY8lKokSVFtdF7b3OYbTG/w200-h113/bigbangtheory.jpeg" width="200" /></a></div>To show what is available, I've taken a sample data set about The Big Bang Theory TV show. It was a great show, and it doesn't matter if you didn't watch it when it first aired because you can probably find an episode on TV at just about any time of the day. For this data set, two databases (Wikipedia and IMDb) were scraped to get information like ratings, viewership, plot lines and more, and the results are housed at <a href="https://www.blogger.com/blog/post/edit/2271061401959643709/2066211005373265081#" target="_blank">data.world</a>. </div><h3 style="text-align: left;">Analysis</h3><div>There are several attributes in this data set (including episode descriptions and titles) but you probably want to stick to the numerical ones. You can do single variable analysis of the number of viewers, the votes and the ratings, as well as some two variable analysis. I like the single variable analysis because you can split the data by season and analyze each season separately. 
</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuaE2R_CCTjOl9FGgHCXGoHCe_kTm-l0MJQWkXd_PBKbcTsuEy_Upvehf0Uko6jdRY3XGdKEXZ-g6mIwhZhR6-BPFXi6VE93BHvFN80cu044Ik-eJSaxG3gLeYH5Vgb1qMOPQGTwfwHYwi/s1814/BigSingle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="598" data-original-width="1814" height="193" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuaE2R_CCTjOl9FGgHCXGoHCe_kTm-l0MJQWkXd_PBKbcTsuEy_Upvehf0Uko6jdRY3XGdKEXZ-g6mIwhZhR6-BPFXi6VE93BHvFN80cu044Ik-eJSaxG3gLeYH5Vgb1qMOPQGTwfwHYwi/w590-h193/BigSingle.png" width="590" /></a></div><br /><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiV1QDeum8mkb_me9O3OI7Xov2hwGP_Elehwwc9JlOdi07LKOWUV3RjkyR9jJ5o__9GJcX36ZymDSn5ptj7RqkQCRszdg_0yU3gZV4LzvzMcXPsv_H4xoxfY0EkoRhEas_KIloRMYwLJKpI/s1202/BigDouble.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="550" data-original-width="1202" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiV1QDeum8mkb_me9O3OI7Xov2hwGP_Elehwwc9JlOdi07LKOWUV3RjkyR9jJ5o__9GJcX36ZymDSn5ptj7RqkQCRszdg_0yU3gZV4LzvzMcXPsv_H4xoxfY0EkoRhEas_KIloRMYwLJKpI/w584-h266/BigDouble.png" width="584" /></a></div></div><h3 style="text-align: left;">Sample Questions</h3><div>Which season had the highest average viewership?</div><div>Is there a connection between the rating and number of votes?</div><div>Which season(s) had the most popular episodes? 
</div><div><br /></div><div><h3 style="background-color: white; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; margin: 0px; position: relative;">Downloads </h3><ul style="background-color: white; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 1.4; list-style-image: initial; list-style-position: initial; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Original data: <a href="https://data.world/priyankad0993/big-band-theory-information" style="font-family: Times; font-size: medium;" target="_blank">https://data.world/priyankad0993/big-band-theory-information</a></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Raw data (<a href="https://docs.google.com/spreadsheets/d/1zpg1EKXpS4GDeTUwi7I7ktpxCUGe0ffT6TVy44s3b4c/edit?usp=sharing" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/file/d/1iqq8CTPkam_jMJiR71bivQU81ITTo9FL/view?usp=sharing" target="_blank">CSV</a>, <a href="https://www.desmos.com/calculator/nzxrf2rasf" target="_blank">Desmos</a>, <a href="https://drive.google.com/file/d/1hrpQSI58d35w9YFZJgErcwKoI27pkRuK/view?usp=sharing" target="_blank">CODAP</a>, <a href="https://drive.google.com/file/d/1k-yVEct4XaMWqg4SduPXNPYGM2w0CBNZ/view?usp=sharing" target="_blank">CODAPwithGraphs</a>)</li></ul><br style="background-color: white; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px;" /><span style="background-color: white; font-family: arial, tahoma, helvetica, freesans, sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><br />David 
Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-22373213567978018612019-03-26T11:43:00.000-07:002019-03-26T11:43:46.924-07:00Mining the Meta Data in your iTunes Library<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDz_9ghhHWe3MxylkC19JRkVeLiRNaVDouXjM3Uz70Gq_d-3QaLvT0lg4Bxp65D3j2mch5t64TCAzuWtRAhUiWMrwq9Rs8GQ6eA0ncyZ9vkAtY5QlSdp8_yvr3musfCMb-DIbJnO8IuQWU/s1600/iTunesGenre.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="654" data-original-width="1574" height="165" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDz_9ghhHWe3MxylkC19JRkVeLiRNaVDouXjM3Uz70Gq_d-3QaLvT0lg4Bxp65D3j2mch5t64TCAzuWtRAhUiWMrwq9Rs8GQ6eA0ncyZ9vkAtY5QlSdp8_yvr3musfCMb-DIbJnO8IuQWU/s400/iTunesGenre.png" width="400" /></a>If you (or your students) use iTunes to keep track of your music then it turns out they have a rich source of data that might be interesting for your students to analyze. I find that if students use their own data they are more interested in looking at that data for analysis. In this case, every song on iTunes (and really, any platform) has a pile of meta data associated with it. In that meta data are things like song name, artist name, album name but also there are numerical values like song length, file size, number of plays etc. So you could have your students get the data from their own library and do the analysis of it.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3fPn8Uaeci3OD7nycgTYjpMShFEdHnhbQGeXJzhLrp9tsp5k5zuVGVe3JbLWjODkShb8PUQXUcoVi3_UcdcrUsEGFWUQo3qG8AzfKaERljWvoNZw6ebRZWYa9v_bDgg-XPyII4CzsEFSv/s1600/ITunesData.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="682" data-original-width="774" height="175" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3fPn8Uaeci3OD7nycgTYjpMShFEdHnhbQGeXJzhLrp9tsp5k5zuVGVe3JbLWjODkShb8PUQXUcoVi3_UcdcrUsEGFWUQo3qG8AzfKaERljWvoNZw6ebRZWYa9v_bDgg-XPyII4CzsEFSv/s200/ITunesData.png" width="200" /></a>Getting the data from iTunes is pretty easy. Once in iTunes, students who want the info for all their music can just click on Songs, and those who want the data from a favourite playlist can click on that playlist instead. Then click on File, then Library, then Export Playlist. iTunes will then save a .TXT file to the folder of your choice. That .TXT file will need a bit of cleaning up, but not much. I suggest importing it into Excel or Google Sheets to clean it up. If you are doing the work in that spreadsheet (or uploading to Desmos) then you're all set. If you plan on importing it into CODAP then save the data as a .CSV file (I noticed that even though you should be able to import a .TXT file into CODAP, the format of this one doesn't seem to work, so you have to convert it to a .CSV).<br />
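If you would rather skip the spreadsheet step, the .TXT to .CSV conversion can also be scripted. What follows is only a rough sketch, and it rests on a few assumptions that are not confirmed above: that the export is tab-delimited, and that it is saved in UTF-16 (change the <code>encoding</code> argument if your file opens as gibberish). The function names and file names are made up for illustration.<br />

```python
import csv
import io


def tab_text_to_csv_text(tab_text):
    """Convert tab-delimited text into comma-separated text."""
    # QUOTE_NONE: treat quotation marks in song titles as ordinary characters.
    reader = csv.reader(
        io.StringIO(tab_text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if row:  # skip completely blank lines
            writer.writerow(row)
    return out.getvalue()


def convert_file(txt_path, csv_path, encoding="utf-16"):
    """Read an exported playlist (assumed encoding) and write it out as UTF-8 CSV."""
    with open(txt_path, encoding=encoding, newline="") as source:
        tab_text = source.read()
    with open(csv_path, "w", encoding="utf-8", newline="") as target:
        target.write(tab_text_to_csv_text(tab_text))


# Hypothetical usage: convert_file("MyPlaylist.txt", "MyPlaylist.csv")
```

Titles that contain commas are wrapped in quotation marks in the output so that they stay in one column when the CSV is opened.<br />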
<h3>
Analysis</h3>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3YyHYixTJAXY91EBmoHm4s2Jdr8lnS6qjCjWK-LuBpz4IFKIaR33vUtQVHg8FWDJXhXQAIERuvQfdUVvfCX0LzgMzhSkStA5axdPMSgpwZXwEndq1x3oofpL1fkRd-Ytp0a_6qn1xrtdC/s1600/iTunesDesmos.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="600" data-original-width="1600" height="120" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3YyHYixTJAXY91EBmoHm4s2Jdr8lnS6qjCjWK-LuBpz4IFKIaR33vUtQVHg8FWDJXhXQAIERuvQfdUVvfCX0LzgMzhSkStA5axdPMSgpwZXwEndq1x3oofpL1fkRd-Ytp0a_6qn1xrtdC/s320/iTunesDesmos.png" width="320" /></a>Though the data itself is not wildly interesting, you can certainly use it to cover topics like mean, median, standard deviation, and other single variable measures. You could also have students compare values from their playlists with those of other students. Note that the song times are in seconds, so if a histogram is created, it is probably appropriate to have bin widths of 30 s or 60 s (let students figure this out).<br />
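For anyone who wants to check the spreadsheet results another way, here is a small sketch of the single variable measures and the 30 s bins. The song times below are made up for illustration and are not taken from the sample library, and the function names are just placeholders.<br />

```python
import statistics
from collections import Counter


def summarize(times_s):
    """Return the mean, median and sample standard deviation of song times."""
    return {
        "mean": statistics.mean(times_s),
        "median": statistics.median(times_s),
        "sd": statistics.stdev(times_s),  # sample SD (divides by n - 1)
    }


def histogram_counts(times_s, bin_width=30):
    """Count songs per bin; each key is the lower edge of a bin, in seconds."""
    counts = Counter((int(t) // bin_width) * bin_width for t in times_s)
    return dict(sorted(counts.items()))


# Made-up song lengths in seconds, for illustration only.
sample_times = [185, 200, 215, 230, 245]

print(summarize(sample_times))
print(histogram_counts(sample_times, bin_width=30))
```

Changing <code>bin_width</code> to 60 is a quick way to let students see how the choice of bin changes the shape of the histogram.<br />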
<br />
One thing that I think is interesting is that you would expect a very strong (if not perfect) relationship between the time of a song and its file size. But as you can see, there seem to be several different relationships. This is due to the bit rate of the file compression. So you might be able to have a conversation about what bit rate is and how it relates to the compression of the file. The lower the bit rate, the smaller the file size (for songs of the same length). So you could talk about why you would want a lower or higher bit rate (hint: a lower bit rate means poorer sound quality but a smaller file size, so there is a trade-off). In CODAP you can create separate graphs of the bit rate data and the scatter plot of size vs time, then highlight parts of the data to show the different relationships. You could also hide or show data based on the bit rate to do more specific analysis by isolating just the data from one bit rate.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRhO8MIOI6e4W_034LAyXMD7kJpz97hVMiGesNTZRNJvxjgcWgKufGFRQs-B3Fw5HkDB6XlodM93gzOuThKe-G-ZHZ8chbTw-cR7_7VJOwTiZQjyXn_D0_ihMU_vKMUWzOwebQITWQYqve/s1600/iTunesCODAP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="460" data-original-width="1600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRhO8MIOI6e4W_034LAyXMD7kJpz97hVMiGesNTZRNJvxjgcWgKufGFRQs-B3Fw5HkDB6XlodM93gzOuThKe-G-ZHZ8chbTw-cR7_7VJOwTiZQjyXn_D0_ihMU_vKMUWzOwebQITWQYqve/s400/iTunesCODAP.png" width="560" /></a></div>
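The separate lines in the scatter plot can also be pulled out numerically. The sketch below groups songs by bit rate and fits a line through the origin (slope = sum of x*y divided by sum of x*x) for each group. It assumes each song is given as a (time in seconds, size in bytes, bit rate in kbps) triple, which is a made-up layout rather than the actual column order of the export, and the three songs shown are invented. For a constant bit rate you would expect a slope near bit rate × 1000 ÷ 8 bytes per second, though real files will be a little larger because of tags and album art.<br />

```python
from collections import defaultdict


def bytes_per_second_by_bit_rate(songs):
    """Fit size = slope * time (a line through the origin) for each bit rate.

    songs: iterable of (time_s, size_bytes, bit_rate_kbps) triples.
    Returns {bit_rate_kbps: slope in bytes per second}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # bit rate -> [sum(x*y), sum(x*x)]
    for time_s, size_bytes, bit_rate_kbps in songs:
        sums[bit_rate_kbps][0] += time_s * size_bytes
        sums[bit_rate_kbps][1] += time_s * time_s
    return {rate: sxy / sxx for rate, (sxy, sxx) in sorted(sums.items())}


# Invented songs that sit exactly on the theoretical lines.
songs = [
    (200, 3_200_000, 128),
    (300, 4_800_000, 128),
    (200, 6_400_000, 256),
]

print(bytes_per_second_by_bit_rate(songs))
```

Comparing each fitted slope with bit rate × 1000 ÷ 8 gives students a concrete way to connect the graph to what bit rate means.<br />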
<h3>
Sample Questions</h3>
<ul>
<li>Choose three numerical attributes from your data and determine the mean, median and SD of each. Graph each attribute using an appropriate representation.</li>
<li>Which genre of music has the highest average song length?</li>
<li>Which song was played the most?</li>
<li>Which decade has the most songs?</li>
<li>Which song was skipped the most?</li>
<li>Determine the relationship between the size of a file and how long the song is for different bit rates. </li>
<li>You have only 50 MB of space left on your device. How many minutes of music could you store using all of the remaining space? (Note that answers will vary based on the bit rate.)</li>
</ul>
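As a quick check on the last question, here is a sketch of the arithmetic. It assumes that the 50 of remaining space means 50 megabytes counted as 50 × 1,000,000 bytes, that the bit rate is constant, and that tags and album art take up no room, so real answers will come out a bit lower. The function name is made up for illustration.<br />

```python
def minutes_of_music(space_mb, bit_rate_kbps):
    """Approximate minutes of audio that fit in space_mb megabytes."""
    space_bits = space_mb * 1_000_000 * 8          # megabytes -> bits
    seconds = space_bits / (bit_rate_kbps * 1000)  # kilobits/s -> bits/s
    return seconds / 60


for rate in (128, 256, 320):
    print(rate, "kbps:", round(minutes_of_music(50, rate), 1), "minutes")
# 128 kbps: 52.1 minutes
# 256 kbps: 26.0 minutes
# 320 kbps: 20.8 minutes
```

Doubling the bit rate halves the minutes, which ties back to the different slopes in the size vs time scatter plot.<br />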
<h3>
Downloads</h3>
<ul>
<li>Sample data from my iTunes Library (<a href="https://drive.google.com/open?id=1tp6K4znto2eXgrliIqu6Ha_9GSx-3kag4agykuXWVDE" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/open?id=1Ki-pQo0UscDcsrvJ-_phKjew4ty1bD2f" target="_blank">CSV</a>, <a href="https://www.desmos.com/calculator/cbswjeytnl" target="_blank">Desmos</a>, <a href="https://drive.google.com/open?id=1xATUXvYD7mDvPv84z0lqnWrTNQBMeUnD" target="_blank">CODAP</a>)</li>
<li>Some sample Graphs (<a href="https://www.desmos.com/calculator/jo5cka0ffd" target="_blank">Desmos</a>, <a href="https://drive.google.com/open?id=1Qf0l4kqdyu_Vb6Dao1DablAh5kJsP2to" target="_blank">CODAP</a>)</li>
</ul>
<div>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span></div>
<div>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;"><br /></span></div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com2tag:blogger.com,1999:blog-2271061401959643709.post-46450505211252165192019-03-22T10:20:00.002-07:002019-03-22T10:20:38.617-07:00Hip Hop VocabularyThis <a href="https://pudding.cool/projects/vocabulary/" target="_blank">post originally</a> came out in <a href="http://ontariomath.blogspot.com/2014/05/math-links-for-week-ending-may-9th-2014.html" target="_blank">2014</a> (before this blog was created) and so I hadn't thought about it for a while. Then I saw a post by Dane Ehlert on his <a href="https://whenmathhappens.com/2019/03/07/wcydwt-hip-hop-vocabulary/" target="_blank">When Math Happens</a> blog and was not only reminded of it but noticed that the original post had been updated with a new look and new data. Basically they take a pile of hip hop artists and count how many unique words each one uses in their first 35,000 lyrics.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg66AcAbwVb3Pmavim0eDizYpPDePPOOws-Oz9aZ_1byRIuOSUwLiZypN5t2vFbOyyTYgz2RfBL31WDWn7W3WJs6jlEqr9UTk7uYi8QkE-ZLOqpoVAqwnwzSGbrdO-J53NTEPwLBt_02D6H/s1600/HipHopVocabulary.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg66AcAbwVb3Pmavim0eDizYpPDePPOOws-Oz9aZ_1byRIuOSUwLiZypN5t2vFbOyyTYgz2RfBL31WDWn7W3WJs6jlEqr9UTk7uYi8QkE-ZLOqpoVAqwnwzSGbrdO-J53NTEPwLBt_02D6H/s400/HipHopVocabulary.png" width="560" /></a></div>
<br />
<h3>
Analysis</h3>
When you go to the site, the visualization (above) is interactive, so you can search for specific artists. This is neat, but on this blog we typically want to do some mathematical analysis. They have other representations, like this one that looks like a histogram, but for our purposes we would like some numbers.<br />
<br />
<div style="text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8L-R0oZewGqOZtB_rsKg1XPeHc6P53esaiowTtCv7aU0u1IAn2bpOYR4oHxkzCHC6HcUf4ikOCm_f-e2bCyecgMeWewtAxiAZV9ZJLmxU2ZOzgCv6WfNmFu-uzAq5AXBanMia1VNgxbMP/s1600/HipHopVocabulary2.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8L-R0oZewGqOZtB_rsKg1XPeHc6P53esaiowTtCv7aU0u1IAn2bpOYR4oHxkzCHC6HcUf4ikOCm_f-e2bCyecgMeWewtAxiAZV9ZJLmxU2ZOzgCv6WfNmFu-uzAq5AXBanMia1VNgxbMP/s400/HipHopVocabulary2.png" width="560" /></a> </div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivBn0VfhdGAj-rk6R04HQqrTvYyV4-Z_n2Vcsmxze4ZQkFrceKCkgqMaJsOT3SvNAv-tsb-YNXYK4NhszpgxrYc9eBG36CBw6qYrnKjm7HnmCOVK15qDOhmuwgwH0xu2YlFocpFJUjA1fK/s1600/HipHopVocabulary3.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="546" data-original-width="600" height="181" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivBn0VfhdGAj-rk6R04HQqrTvYyV4-Z_n2Vcsmxze4ZQkFrceKCkgqMaJsOT3SvNAv-tsb-YNXYK4NhszpgxrYc9eBG36CBw6qYrnKjm7HnmCOVK15qDOhmuwgwH0xu2YlFocpFJUjA1fK/s200/HipHopVocabulary3.png" width="200" /></a>So if you look way down on the post, they do have a <a href="https://docs.google.com/spreadsheets/d/1HIIfgDpNMM-j0hoQHN-yP5P1lNOfJuvym0u0sdWwD9g/edit#gid=737896402" target="_blank">Google Sheet</a> with the number of unique words for each of the over 160 artists. It's not a particularly robust data set but we can do some simple <br />
analysis, like histograms, averages, box plots and other single variable analysis. I don't think there is anything particularly mathematically interesting in the data, but it is data that students might find interesting, so it could be used to practice some standard single variable analysis techniques (central tendency, standard deviation, distributions, dot plots, box plots, histograms, etc.).<br />
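If you would rather work from the CSV than a spreadsheet, the basic single variable summaries can be sketched with Python's built-in statistics module. The numbers below are made up for illustration and are not values from the actual Google Sheet.

```python
import statistics

# Hypothetical unique-word counts, NOT values from the real spreadsheet.
unique_words = [3200, 3900, 4100, 4300, 4500, 4700, 5100, 6400]

mean = statistics.mean(unique_words)
median = statistics.median(unique_words)
sd = statistics.stdev(unique_words)  # sample standard deviation

print("mean:", mean)
print("median:", median)
print("sample SD:", round(sd, 1))
```

A mean that sits above the median, as it does with these made-up numbers, is a hint that a few large values are pulling the distribution to the right.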
<h3>
Sample Questions</h3>
<ul><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiphoJGq42tu-03tHHyqjjvmMMkKh53qPhtXfYMXLmftyi0UyoWkSzqJVhIp0Muiut9bH7i4NbzGqE8ACjKW8iOpSOhm64fTZG-GNXvl-zYqvRFoZwT2x2-CEI9x9iOXA4e-YoIH9wBHR2G/s1600/HipHopVocabulary4.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="744" data-original-width="1600" height="92" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiphoJGq42tu-03tHHyqjjvmMMkKh53qPhtXfYMXLmftyi0UyoWkSzqJVhIp0Muiut9bH7i4NbzGqE8ACjKW8iOpSOhm64fTZG-GNXvl-zYqvRFoZwT2x2-CEI9x9iOXA4e-YoIH9wBHR2G/s200/HipHopVocabulary4.png" width="200" /></a>
<li>Who are the outliers in this data set?</li>
<li>Which decade has the most verbose rappers?</li>
<li>How does your favourite rapper compare to the most/least verbose rapper?</li>
<li>Take a look at some of the questions Dane was asking in <a href="https://whenmathhappens.com/2019/03/07/wcydwt-hip-hop-vocabulary/" target="_blank">his post</a> for some more open questions.</li>
<li>What does the data in the original post say about the number of words used in different types of music?</li>
</ul>
<h3>
Downloads </h3>
<ul>
<li>Original data: <a href="https://pudding.cool/projects/vocabulary/" target="_blank">https://pudding.cool/projects/vocabulary/</a></li>
<li>Raw data (<a href="https://docs.google.com/spreadsheets/d/1HIIfgDpNMM-j0hoQHN-yP5P1lNOfJuvym0u0sdWwD9g/edit#gid=737896402" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/open?id=1AKj1p7rxyNqAywcD7BYd1rVg48Sob5bc" target="_blank">CSV</a>, <a href="https://www.desmos.com/calculator/sqvto8ztnc" target="_blank">Desmos</a>, <a href="https://drive.google.com/open?id=1KEYNOUEOgpef7EuuZRMiTE_3sjLKEeld" target="_blank">CODAP</a>)</li>
</ul>
<br />
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span><br />
<br />David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com2tag:blogger.com,1999:blog-2271061401959643709.post-82927137091571357872019-02-24T21:10:00.000-08:002019-04-01T08:51:31.767-07:00Skipping World RecordA few months back I saw a 3Act Task called <a href="https://gfletchy.com/rope-jumper/" target="_blank">Rope Jumper</a> that <a href="https://twitter.com/gfletchy" target="_blank">@gfletchy</a> created out of <a href="https://www.youtube.com/watch?v=GL3S4obePvw" target="_blank">this video</a>:<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/GL3S4obePvw/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/GL3S4obePvw?feature=player_embedded" width="530"></iframe></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGWi3S2wCoM9K0w_CNBek2uSSdHj4q6VFDEiiydVvTPvboDWTATgx0fzX5jSPq8YUQ7gLIZN7Jc7i2WLl1SJ3H1madeMss1AAXjUofHxcOXS0Oxr3AoYZ6HBD-Omjncj5oLlE3YbEfJz6T/s1600/Skipping-CODAP.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="876" data-original-width="1292" height="135" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGWi3S2wCoM9K0w_CNBek2uSSdHj4q6VFDEiiydVvTPvboDWTATgx0fzX5jSPq8YUQ7gLIZN7Jc7i2WLl1SJ3H1madeMss1AAXjUofHxcOXS0Oxr3AoYZ6HBD-Omjncj5oLlE3YbEfJz6T/s200/Skipping-CODAP.png" width="200" /></a>He shows the first few seconds of the video and you have to guess how many skips are done in 30s. It's a good 3Act task. But that's not what we're doing here. Here I've actually collected the time data from each skip to do a bit of analysis (I had to slow the video down to 50% speed in order to get every skip).<br />
<h3>
Analysis</h3>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgApYz7uNehXASXOEawaodws3JsXTcNbTnkLbE8HM8elDoL2uPZDfutg_PYsKFcCiShxtX-2LsXgs2IKo0ObYn0LBO6n2CBb_AyqDx0Ha6Wu-ZlPKhMEdk_wm-R5AShURWFsDl-2XjaADr3/s1600/Skipping-Desmos.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="1150" data-original-width="1416" height="161" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgApYz7uNehXASXOEawaodws3JsXTcNbTnkLbE8HM8elDoL2uPZDfutg_PYsKFcCiShxtX-2LsXgs2IKo0ObYn0LBO6n2CBb_AyqDx0Ha6Wu-ZlPKhMEdk_wm-R5AShURWFsDl-2XjaADr3/s200/Skipping-Desmos.png" width="200" /></a>As you would guess, the relationship between time and number of skips is pretty linear, but you might notice, as you watch the video, that she seems to slow down at times. The data itself isn't super exciting, but it could be used simply to help students determine the least squares line.<br />
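For anyone who wants to see what the software is doing when it finds the least squares line, here is a minimal sketch in Python using the standard slope and intercept formulas. The (time, skips) pairs are placeholders, not the timings collected from the video, and the function name least_squares_line is just for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (time in seconds, cumulative skips) pairs, NOT the measured data.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
skips = [0, 7, 13, 20, 26]

slope, intercept = least_squares_line(times, skips)
print("skips per second:", round(slope, 2))
print("intercept:", round(intercept, 2))
```

The slope is the average skipping rate in skips per second, so multiplying it by 30 gives a quick estimate for a 30 second run.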
<h3>
Sample Questions</h3>
<ul>
<li>When was she skipping the fastest/slowest and what was the rate?</li>
<li>How many skips do you think she would make in 1 minute?</li>
<li>If she were to keep the pace that she had in the first few seconds, how many skips would she have made in 30s?</li>
<li>If she had skipped at the same rate as she did in her slowest section, would she still have broken the record?</li>
</ul>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8Mx0SVRu-Xyn9qW73u1XhDysXCpnXnFbVK86hyphenhyphenHuSCsGVqBgwNe0ONbS2IKuEKTGC9mc5hTi4Ugy_D4xh3-dU0Mrk8yTa9Kes5uYQ0Cp-S3IVAYNA8zoh4-eyEhC2vQgqqVxRSYotto9j/s1600/SkippingGraph.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="329" data-original-width="488" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8Mx0SVRu-Xyn9qW73u1XhDysXCpnXnFbVK86hyphenhyphenHuSCsGVqBgwNe0ONbS2IKuEKTGC9mc5hTi4Ugy_D4xh3-dU0Mrk8yTa9Kes5uYQ0Cp-S3IVAYNA8zoh4-eyEhC2vQgqqVxRSYotto9j/s400/SkippingGraph.gif" width="560" /></a></div>
<div>
<br /></div>
<h3>
Downloads</h3>
<ul>
<li>Original data (<a href="https://drive.google.com/open?id=1GGAvwi_X4fcUMqG0Zd22oPyXsyfkPwq-" target="_blank">CSV</a>, <a href="https://drive.google.com/open?id=1FHCQIW3_C4WkfALpuneUSpPRl0ZGtgpJUP30gGjsx6A" target="_blank">Google Docs</a>, <a href="https://www.desmos.com/calculator/kfrkyh58af" target="_blank">Desmos</a>, <a href="https://drive.google.com/open?id=121zNHofgG2NneGZeLB_EdiIUxtRWQNc7" target="_blank">CODAP</a>)</li>
<li>Sample Analysis (<a href="https://drive.google.com/open?id=1xIf6-IwVd42Xh6TavtE8rkym2oAQ1l02Hn5G6KeFB_c" target="_blank">Google Docs</a>, <a href="https://www.desmos.com/calculator/za2symefha" target="_blank">Desmos</a>, <a href="https://drive.google.com/open?id=1Yjuma8Gy-k_RUAZIMmRBqeIgoS_d-5B1" target="_blank">CODAP</a>)</li>
</ul>
<div>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span></div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-24917543245458484972019-02-07T21:44:00.000-08:002019-02-08T09:24:44.596-08:00New Desmos Statistics Package<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNbqZKQmk_0NlN7axHxNdL6SaP__Pkjs6pA4VzPP3dQJX451soXbVcvez1QTrvtdO7HP3V9tGC81f1nuRgFp06R2NEvgFp971N2qNX0mi_wGCOjyHwD-4HlcDS-eBqHeWD43TKVisRwGOd/s1600/DesmosStats.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="464" data-original-width="843" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNbqZKQmk_0NlN7axHxNdL6SaP__Pkjs6pA4VzPP3dQJX451soXbVcvez1QTrvtdO7HP3V9tGC81f1nuRgFp06R2NEvgFp971N2qNX0mi_wGCOjyHwD-4HlcDS-eBqHeWD43TKVisRwGOd/s320/DesmosStats.png" width="320" /></a>So for years you have been able to do two variable statistics really well in Desmos. Finding the correlation and lines and curves of best fit is pretty easy and works really well. But this week Desmos released a long awaited update that includes a whole suite of new single variable statistical tools, including visualizations like dot plots, box plots and histograms. And of course the great thing is that all of these visualizations can be made dynamic with a few Desmos slider tricks. For a really nice summary of some of the new features, check out the video from @bobloch below.<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/SQT6RuPTxGs/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/SQT6RuPTxGs?feature=player_embedded" width="530"></iframe></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCWlRb7v7OkEsVnJzuOBYVHI7fLSWSO6WABcFhjpaBLJxHkY0T9t5sghN90N9NRqDVdCgZtVGRC8ussKNQl_aFHsL9odJ7Bt7Wl7dsBCu-IQMpIUH_mmGAueoQ1rMbFgP0BMJaSChtwmQQ/s1600/DesmosStatsIntrozoom.gif" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="514" data-original-width="1004" height="101" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCWlRb7v7OkEsVnJzuOBYVHI7fLSWSO6WABcFhjpaBLJxHkY0T9t5sghN90N9NRqDVdCgZtVGRC8ussKNQl_aFHsL9odJ7Bt7Wl7dsBCu-IQMpIUH_mmGAueoQ1rMbFgP0BMJaSChtwmQQ/s200/DesmosStatsIntrozoom.gif" width="200" /></a>But I wanted to point out a couple features that I really like. First of all the new Zoom Fit feature makes it easy to take any set of data and adjust the axes so that all the data can be seen. Basically all you do is create your graph and then click the icon that looks like the little magnifying glass with the plus in it. This icon will show up for any of the visualizations including the distributions. </div>
<div class="" style="clear: both; text-align: left;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjL755LmrHSSoKtvdI9a5rEF9tty8YXADhdtzYQ_BcGlHSp92GH1fd92dbqIodT0mndyMkdBpbESiys7UaZB4lURCKZ71bxj1jMflVfaUjOpOEe6RZWfzz9Z1btMYou6GESxtubR8PDTz0s/s1600/DesmosStats+%25281%2529.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="270" data-original-width="440" height="122" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjL755LmrHSSoKtvdI9a5rEF9tty8YXADhdtzYQ_BcGlHSp92GH1fd92dbqIodT0mndyMkdBpbESiys7UaZB4lURCKZ71bxj1jMflVfaUjOpOEe6RZWfzz9Z1btMYou6GESxtubR8PDTz0s/s200/DesmosStats+%25281%2529.png" width="200" /></a>Another thing that I like is the control that you get with the various graphs. When you enter any of the functions, you will be told what its arguments are (for histograms, for example, you have the data and the bin width), and some graphs also have settings outside the function. For example, for a box plot you can change the vertical position (Offset) of the box and its vertical size (Height). Any of those values can be made dynamic by replacing them with sliders or the results of computations. </div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEif5OWEBBAGKMGsnMDfQW22IC-zRWmHNcIHeCbfRbkaR58Xf9k6BqE4MkpUKp1HzWQuHME6dxRr67DkaUtLLrg3eaWtNHsNMqvS9uRun0osu8NsWs5SREL-V7m6PoqMdam3i91j0ZtSbX8J/s1600/DesmosStatsIntroSlider.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="514" data-original-width="1004" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEif5OWEBBAGKMGsnMDfQW22IC-zRWmHNcIHeCbfRbkaR58Xf9k6BqE4MkpUKp1HzWQuHME6dxRr67DkaUtLLrg3eaWtNHsNMqvS9uRun0osu8NsWs5SREL-V7m6PoqMdam3i91j0ZtSbX8J/s640/DesmosStatsIntroSlider.gif" width="560" /></a></div>
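To get a feel for why a bin width slider is worth having, here is a small Python sketch that counts the same data with two different bin widths. It is not how Desmos itself computes histograms (the convention for which bin an edge value falls into may differ), and the function name bin_counts and the data are made up for illustration.

```python
from collections import Counter
import math


def bin_counts(data, bin_width, start=0):
    """Count values per bin; each bin covers [left, left + bin_width)."""
    counts = Counter()
    for value in data:
        index = math.floor((value - start) / bin_width)
        counts[start + index * bin_width] += 1
    # Key each count by the left edge of its bin, in increasing order.
    return dict(sorted(counts.items()))


data = [1, 2, 2, 3, 5, 6, 8, 9, 9]  # made-up values

print(bin_counts(data, 2))
print(bin_counts(data, 5))
```

The same nine values look like five small bars with a bin width of 2 and just two tall bars with a bin width of 5, which is exactly the kind of change students can explore by dragging a slider.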
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAuKPkfAdHsTA-dHd34ldn76hebK5eWTsfG5W4yiCCCTFd_kPtApwnzsQix8uc8EO0WDdEMxuC4CjeCs8bm3XGISMWLZNf0FWUBGwWHVfa37oFJDsTK9n89NG2YLFEFup9l8SrJNE0BahX/s1600/DesmosLabel.gif" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="51" data-original-width="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAuKPkfAdHsTA-dHd34ldn76hebK5eWTsfG5W4yiCCCTFd_kPtApwnzsQix8uc8EO0WDdEMxuC4CjeCs8bm3XGISMWLZNf0FWUBGwWHVfa37oFJDsTK9n89NG2YLFEFup9l8SrJNE0BahX/s1600/DesmosLabel.gif" /></a>Like all Desmos graphs, you can save your work, and this is probably the best way to get large data sets to students. And if you want to name your sets, you can get a bit more creative by using subscripts. To get a subscript, start with a variable and then type a "1", and the subscript will appear. Then you can delete the 1 and add whatever you want in its place. Try it out with these data sets from previous posts: <a href="https://www.desmos.com/calculator/0ocyqbrbxp" target="_blank">NFL Salaries</a> or <a href="https://www.desmos.com/calculator/mmydgvnwss" target="_blank">Concert Tours</a><br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
That's a quick intro of the new features. Don't forget to check out the Desmos help files on <a href="https://support.desmos.com/hc/en-us/articles/360022405991" target="_blank">visualizations</a>, <a href="https://support.desmos.com/hc/en-us/articles/360022401451#distributions" target="_blank">distributions</a> and <a href="https://support.desmos.com/hc/en-us/articles/360022401451-Statistics" target="_blank">statistics</a> for more info. Going forward, I will be including Desmos versions of the data sets I post so that you'll have your choice of software to use. Have fun.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxpO72OFNCeFPHPuSkifwTkU5Q3laMN0cjJO-vZh0FOQE23Tw8IS94GviIP02WjcwFCtQwmPCL59ALqUSKqSnozk_qnISK9PJaaeamcB3OcNeUrvqrRhEQFVLM6HriT5XIuDhZCzSCkKjU/s1600/DesmosStatsIntro.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="514" data-original-width="1004" height="203" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxpO72OFNCeFPHPuSkifwTkU5Q3laMN0cjJO-vZh0FOQE23Tw8IS94GviIP02WjcwFCtQwmPCL59ALqUSKqSnozk_qnISK9PJaaeamcB3OcNeUrvqrRhEQFVLM6HriT5XIuDhZCzSCkKjU/s400/DesmosStatsIntro.gif" width="400" /></a></div>
<br />
<br />David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com1tag:blogger.com,1999:blog-2271061401959643709.post-56058622022637615922019-01-04T08:56:00.000-08:002019-02-07T09:07:46.953-08:00Highest Grossing Concert Tours<a href="https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/U2_360_Tour_Croke_Park_2.jpg/500px-U2_360_Tour_Croke_Park_2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="281" data-original-width="500" height="179" src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/U2_360_Tour_Croke_Park_2.jpg/500px-U2_360_Tour_Croke_Park_2.jpg" width="320" /></a>Concerts are a multi-billion dollar industry now, so why not use some concert data to do some statistical analysis? This data comes from the <a href="https://en.wikipedia.org/wiki/List_of_highest-grossing_concert_tours" target="_blank">Wikipedia page</a> on the same subject. On the page, the data is broken up into the top 20 all-time highest grossing concert tours (ordered by gross, unadjusted for inflation). Then it has the top grossing tours for each decade from the 80s until the present. There is data on the decade rank, gross and inflation adjusted gross, the number of shows, attendance and other attributes.<br />
<h3>
Analysis</h3>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEildtpHbP0Xbx80je2YRhg-zrT2uTclfcaCc2TVRIbG8608ldb5XSFL83XlTALS3T9JO3KiSHv_-9pkfTLM8mkIfJKFeFznJ7uOCunqK6Hz_rRI6NEqaofHVwf4S6cf-xNtt_KI6U5O3c_F/s1600/Artists.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="598" data-original-width="1060" height="179" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEildtpHbP0Xbx80je2YRhg-zrT2uTclfcaCc2TVRIbG8608ldb5XSFL83XlTALS3T9JO3KiSHv_-9pkfTLM8mkIfJKFeFznJ7uOCunqK6Hz_rRI6NEqaofHVwf4S6cf-xNtt_KI6U5O3c_F/s320/Artists.png" width="320" /></a>You can start with some categorical analysis by just looking at who made the list each year. This data runs for four decades, so kids might not be into who was big in the 80s. But if you highlight the biggest acts of the last decade, you can still see that more than half of them were artists that were around in the 80s (with U2 being #1), and that U2, Guns n Roses and The Rolling Stones (twice) were in the top 5 of all time (inflation adjusted).<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXfk2FAWGTdd6sDmjnYDguRTXZ9J70_o7kv_zXEUb52hvmzSGIBjGrNjQA3K0F93ZYtm_2Xs8yoKybLFrTWxuPnYcOog9XjHiPE4EXtkbmFQLk9Ac7XszWpO7SUvjRKRu8Pbm3iuBnRdOB/s1600/ConcertHistogram.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="556" data-original-width="1600" height="110" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXfk2FAWGTdd6sDmjnYDguRTXZ9J70_o7kv_zXEUb52hvmzSGIBjGrNjQA3K0F93ZYtm_2Xs8yoKybLFrTWxuPnYcOog9XjHiPE4EXtkbmFQLk9Ac7XszWpO7SUvjRKRu8Pbm3iuBnRdOB/s320/ConcertHistogram.png" width="320" /></a>For more numerical analysis, you could pick any of the data sets to do some single variable analysis, whether it be central tendency, distributions or histograms. There are many choices.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9Dwma63aIAMRtLRZ4618Hcgq2IsPUPxz_D9hJacQZy-fcUEAB4Tv0vBeSVQYeVAsz_GHr51joiOmCWJ42f5_c_f6F6kec1AnPbUMw0ZbXv-geJBvmwYQfcKSef2unOtgZ_9YTh9QqO14C/s1600/ConcertGross.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="420" data-original-width="1114" height="120" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9Dwma63aIAMRtLRZ4618Hcgq2IsPUPxz_D9hJacQZy-fcUEAB4Tv0vBeSVQYeVAsz_GHr51joiOmCWJ42f5_c_f6F6kec1AnPbUMw0ZbXv-geJBvmwYQfcKSef2unOtgZ_9YTh9QqO14C/s320/ConcertGross.png" width="320" /></a>When you create some box plots, you will find that some of the data sets have outliers. In particular, I think it's interesting that the outliers when dealing with the money are different from the outliers when dealing with the number of shows. This might lead you to explore things like the Average Gross and compare it to the money and number of shows.<br />
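One common way that box plots flag outliers is the 1.5 x IQR rule, which can be sketched in Python as below. The values are placeholders rather than numbers from the Wikipedia tables, and different tools compute quartiles in slightly different ways, so the exact cutoffs may not match what CODAP, Desmos or Sheets report.

```python
import statistics


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low = q1 - 1.5 * iqr
    high = q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical number of shows per tour, NOT values from the real data set.
shows = [20, 45, 50, 55, 60, 65, 70, 150]

outliers = iqr_outliers(shows)
print("outliers:", outliers)
```

Running the same function on the gross column and on the number of shows column is a quick way to check whether the same tours are outliers in both.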
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiM4QtZpl_41UXfLXPkjhw0KNZxp0V2NpsriIORiytvJ6cIceu9mlm55xftebJCcfy0STsG8EQBGhpICpYTrfN-Nd9qjsAPemIShw6Xb1-Cq3NDtVC89g4G0w1a5Yx5L-eYP0ygfdEvbk5c/s1600/InflationConcert.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="886" data-original-width="1142" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiM4QtZpl_41UXfLXPkjhw0KNZxp0V2NpsriIORiytvJ6cIceu9mlm55xftebJCcfy0STsG8EQBGhpICpYTrfN-Nd9qjsAPemIShw6Xb1-Cq3NDtVC89g4G0w1a5Yx5L-eYP0ygfdEvbk5c/s320/InflationConcert.png" width="320" /></a>This might lead you to do some two variable analysis. Though there aren't many strong relationships, you could use this to talk about relationships with poor correlations. Technically there is one strong relationship: the one between the Gross and the Inflation adjusted gross. This would be expected, as one relates directly to the other. One thing that I like about this, however, is that it's not a perfect relationship. That is, whoever adjusted for inflation did so using different rates for each year (to make it more realistic, presumably).<br />
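To put a number on how strong a relationship is, students can compute the correlation coefficient. Here is a minimal Python sketch of the Pearson correlation written out from its definition. The gross figures are placeholders (in millions of dollars), not values from the Wikipedia page, and the function name pearson_r is just for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values in millions of dollars, NOT the real tour data.
gross = [100, 200, 300, 400, 500]
adjusted_gross = [180, 260, 390, 450, 540]

r = pearson_r(gross, adjusted_gross)
print("r =", round(r, 3))
```

With these placeholder numbers the correlation is very close to 1 without being exactly 1, which mirrors the strong but imperfect relationship described above.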
<h3>
Sample Questions</h3>
<br />
<ul>
<li>Which artist made the most (overall or per concert)?</li>
<li>Which decade made the most money (adjusted for inflation)?</li>
<li>Which artists are outliers the most often?</li>
<li>Calculate the mean and median for each of the numeric attributes. What do these values suggest about the distributions?</li>
</ul>
<br />
<h3>
Downloads</h3>
<br />
<ul>
<li>Original <a href="https://en.wikipedia.org/wiki/List_of_highest-grossing_concert_tours" target="_blank">data set</a></li>
<li>Original Data Google <a href="https://drive.google.com/open?id=1djhHWHCXOB521qfy2C18ExQJ6nupB7esRe6PR8gijh0" target="_blank">Sheet</a>, <a href="https://drive.google.com/open?id=1JCDWbnT_uuOyEDGApWkJ7AKB4ntIzxI08E7tPXUlwzw" target="_blank">CSV</a>, <a href="https://drive.google.com/open?id=1O8NDxSp_AYmHlisPr6t9NJvFB1Jon0zd" target="_blank">CODAP</a>, <a href="https://www.desmos.com/calculator/mmydgvnwss" target="_blank">Desmos</a></li>
<li>With some graphs: <a href="https://drive.google.com/open?id=1alcfS2AqKn-SwEOpccc7sDZzk7WO88qiWd1Dvpckx9Q" target="_blank">Sheets</a>, <a href="https://drive.google.com/open?id=1AFkvtcW3svi5Sp48rtVwUWgGTMaKIw7R" target="_blank">CODAP</a></li>
</ul>
<br />
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span><br />
<div>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;"><br /></span></div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com2tag:blogger.com,1999:blog-2271061401959643709.post-16541047959875261972018-11-10T11:09:00.000-08:002018-11-10T11:09:23.828-08:00Notre Dame University - "The Shirt"<h3>
Guest Post - by Michael Lieff (@virgonomic)</h3>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh70oLjSGd7810ViYZOdvzp8tSttBvL2HSXVa4L0b0X44Tn5JW3Szdxsy6yEs2rFyPkypwO_RVjmdXZCknnKLPFwgMQicF3PPuSVT79pyBY2VD_hyMI7EPIYc5GQlPRaXJaE0rq4etBUx0/s1600/notre+dame+-+the+shirt+2017.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="375" data-original-width="300" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh70oLjSGd7810ViYZOdvzp8tSttBvL2HSXVa4L0b0X44Tn5JW3Szdxsy6yEs2rFyPkypwO_RVjmdXZCknnKLPFwgMQicF3PPuSVT79pyBY2VD_hyMI7EPIYc5GQlPRaXJaE0rq4etBUx0/s200/notre+dame+-+the+shirt+2017.png" width="160" /></a><br />
Every year for the last 15 years, my neighbour, who is a die hard Fighting Irish fan, has planned a driving trip to Notre Dame University near South Bend, Indiana. I attended for the first time in 2017 and again in 2018. After a travel day, the first stop on the campus tour is the bookstore. In the lobby, they have a table with one style of short- and long-sleeve t-shirts. In 2017 "<a href="https://theshirt.nd.edu/" target="_blank">the shirt</a>" was navy and it didn't really grab me.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEii5lvw-5CcJYQB6wC_tFKx_ovFmQKp-ejh3P1bq2asad1OachyB2esTC34db2y5KVRzdYl9TBPXrD14fsyQVEHN_IzrPWqCs07NeruBtao60vrYUKceQ8yeuoIGJMw6APZTUg3Exn3eM0/s1600/notre+dame+-+the+shirt+2018.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="300" data-original-width="300" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEii5lvw-5CcJYQB6wC_tFKx_ovFmQKp-ejh3P1bq2asad1OachyB2esTC34db2y5KVRzdYl9TBPXrD14fsyQVEHN_IzrPWqCs07NeruBtao60vrYUKceQ8yeuoIGJMw6APZTUg3Exn3eM0/s200/notre+dame+-+the+shirt+2018.png" width="200" /></a><br />
However, in 2018 the shirt was kelly green which drew me in, as green is my favourite colour. I read the price tag and learned that "the shirt" is a student initiative and the proceeds go back into student activities and assistance. At $18 USD it was a no-brainer.<br />
<br />
Once I had my shirt, I visited the URL on the price tag. There is a link to a timeline that shows the shirt design from every year, and more importantly, the number of shirts sold, the team's record and the shirt manufacturer. Found data! Even more interesting is that there is no data for number sold for the years 1994-1996.<br />
<h3>
Analysis</h3>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFR2M7fzcSeiSgvInul5wpeB6uZ-r8G10yZCOzDpk1bgyM08ahEX-y_PjVwGJAykk1NnvpghFKgXiqlnjs1BaObMba1kZwaPhiAtY6lYAjmdIoa9vt33w-lZGjUVNXHdcuQe89sEEmjkg/s1600/Notre+Dame+-+The+Shirt+-+NumSold+vs+Year.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="400" data-original-width="425" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFR2M7fzcSeiSgvInul5wpeB6uZ-r8G10yZCOzDpk1bgyM08ahEX-y_PjVwGJAykk1NnvpghFKgXiqlnjs1BaObMba1kZwaPhiAtY6lYAjmdIoa9vt33w-lZGjUVNXHdcuQe89sEEmjkg/s200/Notre+Dame+-+The+Shirt+-+NumSold+vs+Year.png" width="200" /></a>The first question that came to my mind is: how many shirts did they sell from 1994-1996? Due to this gap, the dataset is a really nice example to explore interpolation and extrapolation. I figured the trend would be linear and the line of best fit would give a pretty logical prediction. Upon visualization, it definitely isn't cut-and-dried.<br />
<br />
There are some interesting things going on here. The number of shirts sold dropped fairly significantly from 1993 to 1997. It also skyrocketed in 2002 and then plummeted in 2004. Possible reasons for this would make for an interesting discussion.<br />
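One simple way to estimate the missing 1994-1996 values is straight-line interpolation between the 1993 and 1997 points. The Python sketch below shows the idea. The sales figures in it are placeholders, not the numbers from the timeline, and the function name interpolate is made up for illustration.

```python
def interpolate(x0, y0, x1, y1, x):
    """Linear interpolation between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Placeholder sales figures, NOT the numbers from the shirt timeline.
sold_1993 = 50_000
sold_1997 = 30_000

for year in (1994, 1995, 1996):
    estimate = interpolate(1993, sold_1993, 1997, sold_1997, year)
    print(year, round(estimate))
```

Students could compare these estimates with what a line of best fit through all of the years predicts, and discuss which seems more believable given how much the data jumps around.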
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhScQe5hET4pDoI9Ey65aZ4CdvxiwrPnHCvVmCuI5Q15Y6W64xltu39sfoHnSQ1XOokHfRyFk83bE7bbLEgbNEq3rq6AlKoZEJMWd8qnVDYq9xyl1YrryM8_FfEpIvin-p7rO4rx5m1TVQ/s1600/Notre+Dame+-+The+Shirt+NumSold+vs+W.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="400" data-original-width="425" height="188" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhScQe5hET4pDoI9Ey65aZ4CdvxiwrPnHCvVmCuI5Q15Y6W64xltu39sfoHnSQ1XOokHfRyFk83bE7bbLEgbNEq3rq6AlKoZEJMWd8qnVDYq9xyl1YrryM8_FfEpIvin-p7rO4rx5m1TVQ/s200/Notre+Dame+-+The+Shirt+NumSold+vs+W.png" width="200" /></a>Drilling a bit deeper, the next question that came to mind is: Are more shirts sold in seasons where the team is winning?<br />
<br />
It doesn't appear so, but I will let you 'do the math'.<br />
<h3>
<!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="false"
DefSemiHidden="false" DefQFormat="false" DefPriority="99"
LatentStyleCount="375">
<w:LsdException Locked="false" Priority="0" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 6"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 7"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 8"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index 9"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" Name="toc 9"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Normal Indent"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="footnote text"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="annotation text"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="header"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="footer"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="index heading"/>
<w:LsdException Locked="false" Priority="35" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="table of figures"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="envelope address"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="envelope return"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="footnote reference"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="annotation reference"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="line number"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="page number"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="endnote reference"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="endnote text"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="table of authorities"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="macro"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="toa heading"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Bullet"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Number"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Bullet 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Bullet 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Bullet 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Bullet 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Number 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Number 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Number 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Number 5"/>
<w:LsdException Locked="false" Priority="10" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Closing"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Signature"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="true"
UnhideWhenUsed="true" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text Indent"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Continue"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Continue 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Continue 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Continue 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="List Continue 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Message Header"/>
<w:LsdException Locked="false" Priority="11" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Salutation"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Date"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text First Indent"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text First Indent 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Note Heading"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text Indent 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Body Text Indent 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Block Text"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Hyperlink"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="FollowedHyperlink"/>
<w:LsdException Locked="false" Priority="22" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Document Map"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Plain Text"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="E-mail Signature"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Top of Form"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Bottom of Form"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Normal (Web)"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Acronym"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Address"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Cite"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Code"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Definition"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Keyboard"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Preformatted"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Sample"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Typewriter"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="HTML Variable"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Normal Table"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="annotation subject"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="No List"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Outline List 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Outline List 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Outline List 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Simple 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Simple 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Simple 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Classic 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Classic 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Classic 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Classic 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Colorful 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Colorful 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Colorful 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Columns 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Columns 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Columns 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Columns 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Columns 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 6"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 7"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Grid 8"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 4"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 5"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 6"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 7"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table List 8"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table 3D effects 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table 3D effects 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table 3D effects 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Contemporary"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Elegant"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Professional"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Subtle 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Subtle 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Web 1"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Web 2"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Web 3"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Balloon Text"/>
<w:LsdException Locked="false" Priority="39" Name="Table Grid"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Table Theme"/>
<w:LsdException Locked="false" SemiHidden="true" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" SemiHidden="true" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" QFormat="true"
Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" QFormat="true"
Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" QFormat="true"
Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" QFormat="true"
Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" QFormat="true"
Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" QFormat="true"
Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" SemiHidden="true"
UnhideWhenUsed="true" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="TOC Heading"/>
<w:LsdException Locked="false" Priority="41" Name="Plain Table 1"/>
<w:LsdException Locked="false" Priority="42" Name="Plain Table 2"/>
<w:LsdException Locked="false" Priority="43" Name="Plain Table 3"/>
<w:LsdException Locked="false" Priority="44" Name="Plain Table 4"/>
<w:LsdException Locked="false" Priority="45" Name="Plain Table 5"/>
<w:LsdException Locked="false" Priority="40" Name="Grid Table Light"/>
<w:LsdException Locked="false" Priority="46" Name="Grid Table 1 Light"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark"/>
<w:LsdException Locked="false" Priority="51" Name="Grid Table 6 Colorful"/>
<w:LsdException Locked="false" Priority="52" Name="Grid Table 7 Colorful"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 1"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 1"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 1"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 1"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 1"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 2"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 2"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 2"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 2"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 2"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 2"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 2"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 3"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 3"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 3"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 3"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 3"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 3"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 3"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 4"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 4"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 4"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 4"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 4"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 4"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 4"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 5"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 5"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 5"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 5"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 5"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 5"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 5"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 6"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 6"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 6"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 6"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 6"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 6"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 6"/>
<w:LsdException Locked="false" Priority="46" Name="List Table 1 Light"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark"/>
<w:LsdException Locked="false" Priority="51" Name="List Table 6 Colorful"/>
<w:LsdException Locked="false" Priority="52" Name="List Table 7 Colorful"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 1"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 1"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 1"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 1"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 1"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 2"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 2"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 2"/>
</w:LatentStyles>
</xml><![endif]--><!--[if gte mso 10]>
<![endif]-->
</h3>
<div class="MsoNormal">
<h3>
Sample Questions</h3>
</div>
<div class="MsoNormal">
In terms of analysis, the following questions could be asked:<br />
<ul>
<li>Is the trend linear or is a curve a better model?</li>
<li>Can you interpolate the number of shirts sold in 1994-1996
where there is missing data? Extrapolate the number sold in 2018 or beyond?</li>
<li>What are the mean, median and mode number sold?</li>
<li>Does the number of shirts sold correlate with the team’s wins
that season?</li>
</ul>
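For the interpolation question, here is a minimal Python sketch of one way to fill the gap, assuming a straight line between the nearest years that do have data. The sales figures are invented for illustration and are not taken from the real data set; the helper name <code>interpolate</code> is also just a placeholder.<br />

```python
# Hypothetical yearly sales (in thousands); None marks a year with no record.
# These numbers are invented for illustration, not taken from the real data set.
sales = {1992: 30, 1993: 40, 1994: None, 1995: None, 1996: None, 1997: 80}


def interpolate(data):
    """Fill None gaps with a straight line between the nearest known years.

    Assumes every gap has a known year on both sides (no gaps at the ends).
    """
    known = sorted(year for year, value in data.items() if value is not None)
    filled = dict(data)
    for year, value in data.items():
        if value is not None:
            continue
        before = max(y for y in known if y < year)  # nearest known year earlier
        after = min(y for y in known if y > year)   # nearest known year later
        slope = (data[after] - data[before]) / (after - before)
        filled[year] = data[before] + slope * (year - before)
    return filled


filled = interpolate(sales)
print(filled[1994], filled[1995], filled[1996])
```

Students could compare these straight-line estimates with what a curved model predicts for the same years and discuss which seems more believable.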
<h3>
Download the Data</h3>
<ul>
<li>Original Data: <a href="https://theshirt.nd.edu/" target="_blank">https://theshirt.nd.edu/</a></li>
<li>The timeline <a href="https://theshirt.nd.edu/history/timeline/" target="_blank">https://theshirt.nd.edu/history/timeline/</a> </li>
<li><a href="https://drive.google.com/open?id=1-OC-nxUWgwmwKFIknG91MTy3tCJR2uiJ" target="_blank">CSV</a> Version</li>
<li><a href="https://drive.google.com/open?id=1bv8axMBTO7WPG2W7JCc0geWhqClrIPfl" target="_blank">CODAP</a> Version </li>
<li><a href="https://drive.google.com/open?id=1pJ2W5eUv7AD39Cg_MmCVWNRljcoq8YEM1zMoYrBS3ds" target="_blank">Google</a> Sheets Version</li>
</ul>
Let us know if you use this dataset or have any suggestions for things to do with it beyond this.</div>
virgonomichttp://www.blogger.com/profile/12357471138493293752noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-27084703515298532812018-11-05T13:26:00.000-08:002019-02-07T09:06:36.059-08:002018 NFL SalariesWe have a local NFL player that went to high school in one of the schools I support. Luke Willson was recently on the Seattle Seahawks and currently is on our local Detroit Lions. In conversation, a coworker wondered how much his salary was. The Internet <a href="https://www.pro-football-reference.com/players/salary.htm" target="_blank">provides</a>. Not only his salary, but the salary of every one of the almost 1800 players (who knew there were so many?).<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjC6u028e2yRoQW5fw8e4ZW5vFBStqxqXvQ2GqaUNVSzjDrLhK1zFHpW1ea2F7ZpJ7HQSi8RXdT18Ca873jUi05mNa2YoyoWPbY7oq7VKSeHygqvecy-SD14ltNpckr7cykl3zddRu_60sZ/s1600/NFLSpreadsheet.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="442" data-original-width="1048" height="167" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjC6u028e2yRoQW5fw8e4ZW5vFBStqxqXvQ2GqaUNVSzjDrLhK1zFHpW1ea2F7ZpJ7HQSi8RXdT18Ca873jUi05mNa2YoyoWPbY7oq7VKSeHygqvecy-SD14ltNpckr7cykl3zddRu_60sZ/s400/NFLSpreadsheet.png" width="400" /></a>And when you have such a large data set, I think that you should analyze it. It's not a particularly deep topic. But it's a good data set to talk about mean, median, skewing and outliers. Not anything super interesting from a data perspective but the context may be interesting enough to capture the interest of some of your students to do basic single variable analysis. The data includes info about a player's name, salary, position, team, overall rank and I added the team rank. There are 32 teams and a bit over 50 players per team.<br />
<h3>
Analysis</h3>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia7jqxidvnWxL4aspPMz3_fspWZuv-0W-ZsREOExRXA4xP5Z1Eawi4v2EpzbBISEK-Ki5AAyNKSea2PavEH8vRMTs_WHOZuH8iu9VdRDmCPTuJIEKqKHDIv838bdE1M1rLJapv_dQkG5gY/s1600/NFL+Boxplot.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="340" data-original-width="830" height="163" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia7jqxidvnWxL4aspPMz3_fspWZuv-0W-ZsREOExRXA4xP5Z1Eawi4v2EpzbBISEK-Ki5AAyNKSea2PavEH8vRMTs_WHOZuH8iu9VdRDmCPTuJIEKqKHDIv838bdE1M1rLJapv_dQkG5gY/s400/NFL+Boxplot.png" width="400" /></a></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTaLoHQ1-zMcLZBz3KtiyEq2Vw-2MPRIaElS3bLj2zPRfvThfYeUIxvy-wjsZNPNaiDepSjyitrIc4P8zXKcpinmGhZIZBlGEFOjtHMV_JpOTuqjI04ZLUOQLcU0DNS-tsl2nrqGHnKeLl/s1600/NFL-Hisotgram.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="634" data-original-width="1600" height="157" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTaLoHQ1-zMcLZBz3KtiyEq2Vw-2MPRIaElS3bLj2zPRfvThfYeUIxvy-wjsZNPNaiDepSjyitrIc4P8zXKcpinmGhZIZBlGEFOjtHMV_JpOTuqjI04ZLUOQLcU0DNS-tsl2nrqGHnKeLl/s400/NFL-Hisotgram.png" width="400" /></a>An obvious first step is to create some graphs. The first types that come to mind are a dot plot, a box plot and a histogram. In this case the dot and box plots are provided by CODAP while the histogram comes from Google Sheets. You can see from the dot plot that the mean and median are well separated (which we would expect from the skew) and that there are a large number of outliers.<br />
<br />
Since we were talking about Luke Willson, we could certainly ask how his salary compares to other NFL players (he's 455th), to other players on his team (he's 18th of 56) or even to other players at the same position (he's 21st of about 126 tight ends and is above the mean tight end salary).<br />
<h3>
Sample Questions</h3>
<div class="separator" style="clear: both; text-align: left;">
</div>
<ul><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1iW3sNVk_oenOGYjt5OSSMK_1SdVLIi2TZb9P7UxwyOfnf8aq4IY8EyG4cxqex4iAImghJYTQDeCWiO67FeCr-5cKE2K0SwUl6dQoHLpOW32BvaHCmgFjcbm1j4jMpZAfklsoX4031XEk/s1600/NFL+Tight+Ends.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="487" data-original-width="1022" height="152" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1iW3sNVk_oenOGYjt5OSSMK_1SdVLIi2TZb9P7UxwyOfnf8aq4IY8EyG4cxqex4iAImghJYTQDeCWiO67FeCr-5cKE2K0SwUl6dQoHLpOW32BvaHCmgFjcbm1j4jMpZAfklsoX4031XEk/s320/NFL+Tight+Ends.png" width="320" /></a>
<li>Determine the mean, median and standard deviation for the salaries attribute.</li>
<li>Which team has the highest mean salary? median salary?</li>
<li>Choose a player. How do they compare to the rest of the league, their team and their position?</li>
<li>Besides the way it looks, what confirms that this data is skewed to the right?</li>
<li>Which team has the highest number of outliers?</li>
</ul>
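If you want to check answers outside of CODAP or Google Sheets, here is a minimal Python sketch of the single variable summary. The salaries are invented for illustration (they are not from the real file), and the 1.5 × IQR fence is just one common convention for flagging outliers; other tools may compute quartiles slightly differently.<br />

```python
import statistics

# Hypothetical salaries in millions of dollars (invented, not from the real file).
salaries = [0.5, 0.6, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2, 5.0, 20.0]

mean = statistics.mean(salaries)
median = statistics.median(salaries)
stdev = statistics.stdev(salaries)  # sample standard deviation

# Quartiles (Python 3.8+), then the usual 1.5 * IQR fences for outliers.
q1, q2, q3 = statistics.quantiles(salaries, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [s for s in salaries if s < lower_fence or s > upper_fence]

print("mean:", round(mean, 2), "median:", round(median, 2), "stdev:", round(stdev, 2))
print("outliers:", outliers)
print("mean above median (suggests right skew):", mean > median)
```

A mean sitting well above the median is one common, though not foolproof, numerical sign of right skew, which gives students something to point to besides the shape of the graph.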
<br />
<h3 style="clear: both; text-align: left;">
Download the Data</h3>
<div class="separator" style="clear: both; text-align: left;">
</div>
<ul>
<li>Raw data Google <a href="https://drive.google.com/open?id=1av7PiZ8RXxAZwNL47UUFWKr1pGTcYPNorfjwsK0fY1M" target="_blank">Sheets</a>, <a href="https://drive.google.com/open?id=1hcOK6ONrz06pfzgYggTttd5NgFOowxrF" target="_blank">CSV</a>, <a href="https://drive.google.com/open?id=13YiacqS251Ra_in2-f5I9uk-n9xgZWHh" target="_blank">CODAP</a>, <a href="https://www.desmos.com/calculator/0ocyqbrbxp" target="_blank">Desmos</a></li>
<li>Some graphs Google <a href="https://drive.google.com/open?id=18dMmmMkUHD3OXuhWdrFNyj68zjXmc9jnydfXj6R-9og" target="_blank">Sheets</a>, <a href="https://drive.google.com/open?id=1_hVnY-mQUXc77bvSfT1Jrbh9F5s0xeyS" target="_blank">CODAP</a></li>
<li>Original <a href="https://www.pro-football-reference.com/players/salary.htm" target="_blank">data</a> </li>
</ul>
<div>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span></div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com1tag:blogger.com,1999:blog-2271061401959643709.post-73856610870573374802018-10-26T14:54:00.000-07:002020-03-08T12:11:03.375-07:00Walnut Crushing World Record (with 3 Act Task)Check out this video (thanks to <a href="https://twitter.com/ddmeyer/status/1055229889836593152" target="_blank">@ddmeyer for pointing</a> this one out).<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/i1PQX64cTgY/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/i1PQX64cTgY?feature=player_embedded" width="530"></iframe></div>
<br />
So the guy crushes walnuts with his head and what we get is a linear relationship. There are a few things here. First off, there is a 3 Act task. I modelled the 3 Act task on <a href="https://twitter.com/gfletchy" target="_blank">@Gfletchy</a>'s similar task for <a href="https://gfletchy.com/rope-jumper/" target="_blank">rope jumping</a>. Secondly, I timed how long it took for each walnut to get crushed and collected the times in a file (if you are interested, I slowed the video down by 50% then used an online timer to get the splits). So now you can do some analysis. It's not a particularly interesting data set but it might give a fun context to look at linear relationships.<br />
<br />
<h3>
3 Act Task</h3>
<b><u>
Act 1</u></b> - Watch the movie<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/FKXYpDduQfg/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/FKXYpDduQfg?feature=player_embedded" width="530"></iframe></div>
<div>
<div>
How many walnuts will he be able to crush with his head in 60 seconds? Estimate</div>
<div>
Write an estimate you know is too high. Write an estimate you know is too low.<br />
<br /></div>
</div>
<b><u>
Act 2a</u></b> - Before you show this ask students what information they would like to have.<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/DyNXc4iXZt0/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/DyNXc4iXZt0?feature=player_embedded" width="530"></iframe></div>
<br />
<b><u>
Act 2b</u></b> - Show this video for information with more accessible math<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/tYhpesQ0jzU/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/tYhpesQ0jzU?feature=player_embedded" width="530"></iframe></div>
<br />
<u><b>
Act 2c</b></u> - Show this video for information with even more accessible math<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/tBlBk0Fnbu0/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/tBlBk0Fnbu0?feature=player_embedded" width="530"></iframe></div>
<br />
<b><u>
Act 3</u></b> - Show this video to reveal the answer.<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/xgM2jbUZcc0/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/xgM2jbUZcc0?feature=player_embedded" width="530"></iframe></div>
<div>
<br /></div>
<h3>
Analysis</h3>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRwKAUgadFgvEUKPJTi0jnza18y77iNH5YCnD74UyGI_tLDw7xlxuOV1O4DOIbZWLo1Rku3-RHsd_D65p0UyshA90_G3IldV_jJGzq4WGZezENZRAeBt1TWMHaWkuLulaaenzyfChFQ2fz/s1600/Walnut+Codap.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="832" data-original-width="946" height="175" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRwKAUgadFgvEUKPJTi0jnza18y77iNH5YCnD74UyGI_tLDw7xlxuOV1O4DOIbZWLo1Rku3-RHsd_D65p0UyshA90_G3IldV_jJGzq4WGZezENZRAeBt1TWMHaWkuLulaaenzyfChFQ2fz/s200/Walnut+Codap.png" width="200" /></a>I guess the question that most comes to my mind (after "does he have a headache?") is whether he is crushing the walnuts at a constant rate. Careful observation might find a couple of spots where he hesitates a bit and you might want to discuss whether that shows up in the data. But is the data linear? Looking at the graph you can see that for the most part it is. There is a slightly faster rate at the beginning and a slightly slower rate at the end, but each section seems pretty linear.<br />
<br />
Another thing you might want to discuss is whether it should be Time vs Walnuts or Walnuts vs Time. Since rates are usually per unit time, it probably makes sense to do Walnuts vs Time. But you could argue either that the total time depends on the number of walnuts or that the total number of walnuts you could crush depends on how much time you have. Note that the easiest way to swap the axes in a Google spreadsheet is to change the order of the columns, so I just copied the Time column to both sides of the Walnuts column.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJud0fj7afnDfw_SWWvRrn77TU2lE4uhdFQKeFgYjhG4tLs2AN6mTKgsZUbT76GFBrKs31uyc0x7V8cGuodfYJsoJ923aKiexUi6RNMDWCRkz3bKpn6XWD5RWcdFgeKMO2KTXghgE-ldgx/s1600/WalnutSplit.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="486" data-original-width="1600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJud0fj7afnDfw_SWWvRrn77TU2lE4uhdFQKeFgYjhG4tLs2AN6mTKgsZUbT76GFBrKs31uyc0x7V8cGuodfYJsoJ923aKiexUi6RNMDWCRkz3bKpn6XWD5RWcdFgeKMO2KTXghgE-ldgx/s400/WalnutSplit.png" width="560" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<h3>
Sample Questions</h3>
<div>
Besides the above questions you could certainly ask:</div>
<div>
<ul>
<li>What's the line of best fit?</li>
<li>What's the correlation?</li>
<li>How many walnuts do you think he could crush if it were two minutes? 10 minutes?</li>
<li>Is there a better fit than linear?</li>
<li>How many nuts would he have cracked if he kept at the same pace as the first 10 seconds?</li>
<li>If you only saw the first 5 seconds, what would be your prediction of the number crushed in 1 minute?</li>
<li>Can you tell, on the graph, when he hesitated?</li>
<li>What if he had kept his finishing pace for the whole minute? How many nuts would he have cracked then? I think <a href="https://www.youtube.com/watch?v=bqJ_HIuJpsw" target="_blank">this was</a> the previous <a href="http://www.guinnessworldrecords.com/world-records/most-walnuts-cracked-against-the-head-in-one-minute" target="_blank">record of 281</a>. </li>
</ul>
</div>
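Several of the questions above (line of best fit, correlation, extending to two minutes, keeping the opening pace) can be sketched in a few lines of Python. This is only a rough sketch: the splits below are invented for illustration and are not the timings I collected from the video, and the least squares formulas are written out by hand so nothing beyond plain Python is needed.<br />

```python
# Hypothetical splits, invented for illustration (not the timings from the video):
# walnuts crushed so far at each elapsed time in seconds.
times = [0, 10, 20, 30, 40, 50, 60]
walnuts = [0, 45, 88, 130, 170, 208, 245]

n = len(times)
mean_t = sum(times) / n
mean_w = sum(walnuts) / n

# Sums of squares and cross products for least squares regression.
sxx = sum((t - mean_t) ** 2 for t in times)
syy = sum((w - mean_w) ** 2 for w in walnuts)
sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(times, walnuts))

slope = sxy / sxx                   # walnuts per second
intercept = mean_w - slope * mean_t
r = sxy / (sxx * syy) ** 0.5        # correlation coefficient

# Extrapolate the fitted line to two minutes.
predicted_two_minutes = intercept + slope * 120

# What if he had kept the pace of the first 10 seconds for the full minute?
first_ten_pace_total = walnuts[1] / times[1] * 60

print("slope:", round(slope, 2), "intercept:", round(intercept, 2), "r:", round(r, 4))
print("predicted at 120 s:", round(predicted_two_minutes))
print("total at opening pace:", first_ten_pace_total)
```

Comparing the opening pace total with the line of best fit prediction is a quick way to start the conversation about whether the rate really is constant.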
<h3>
Download the Data</h3>
<ul>
<li>Original video: <a href="https://youtu.be/i1PQX64cTgY" target="_blank">https://youtu.be/i1PQX64cTgY</a></li>
<li>Google Sheets <a href="https://drive.google.com/open?id=1rH8ZlXPAQ-iNsaBMJC9s-q12MrNdtJiXzfSI5l9OAG0" target="_blank">Version</a></li>
<li>CODAP <a href="https://drive.google.com/open?id=1sAHYDR29ImBEcO7A2x1taMi7uDu6V94U" target="_blank">Version</a></li>
<li>Comma Separated <a href="https://drive.google.com/open?id=1ooqbw7rXjDGWZ9YkQ5RAA7DSXM9Mf6kb6_LkvMbQcPY" target="_blank">Version</a></li>
<li>Desmos <a href="https://www.desmos.com/calculator/ikkoyredzk" target="_blank">Version</a></li>
</ul>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span><br />
<br />David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-65190696209200754392018-08-26T19:36:00.003-07:002019-04-01T08:23:27.517-07:00Using the CODAP Online Statistics Software for Simple Analysis<div class="separator" style="clear: both; text-align: center;">
<a href="https://codap.concord.org/wp-content/themes/cc/img/codap-logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="160" data-original-width="377" height="135" src="https://codap.concord.org/wp-content/themes/cc/img/codap-logo.png" width="320" /></a></div>
So for years I have been a user of Fathom. Fathom is a dynamic statistical software package that has been available for teachers and students, free, here in Ontario. However, the software itself has not been updated over time and currently won't even run on a relatively recently purchased Mac. Not to fear, some of the creators of Fathom have come together to create the <a href="https://codap.concord.org/" target="_blank">Common Online Data Analysis Platform</a> (CODAP).<br />
<br />
And because it was created by the people who gave us Fathom, it has a lot of similarities in style and function. It's not quite exactly the same but the biggest advantage is that it resides online so you can assign data for students to analyze and they can do so on any platform (probably not on a small screen phone very easily but still technically possible).<br />
But for simple analysis, it does almost all the same things that Fathom did. Categorical and numerical analysis, mean & median, dot plots, scatter plots, linear regression, moveable lines, sum of squares, box plot, outliers and more. Some things it doesn't do (yet) are make bar graphs (though it makes the equivalent with dot plots) and histograms (though this may become an added feature). You can watch how easy it is to do some of those things dealing with simple analysis on the video seen below. If you want to play along with the video, <a href="https://drive.google.com/open?id=1fdpuxHr8_8CEbsd0mc_yhwy6IUdNjF-Y" target="_blank">here</a> is the file that I used.<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Dm7DFi6Iums/0.jpg" frameborder="0" height="315" src="https://www.youtube.com/embed/Dm7DFi6Iums?feature=player_embedded" width="530"></iframe></div>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjm8UFPgl5ODD3fDLxR8V5gje0e99nDyie8opA7-f5S1OMb9mwqXufJtod4D48kgiW9xGS5doxN6tYTVOQF_6nlOn7LcFxNTYre3TA9w0Gj4crT3_hNHVDMBkrlnYcval4UpR6p88xYT-kD/s1600/CODAP+Download.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="503" data-original-width="924" height="217" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjm8UFPgl5ODD3fDLxR8V5gje0e99nDyie8opA7-f5S1OMb9mwqXufJtod4D48kgiW9xGS5doxN6tYTVOQF_6nlOn7LcFxNTYre3TA9w0Gj4crT3_hNHVDMBkrlnYcval4UpR6p88xYT-kD/s400/CODAP+Download.png" width="400" /></a>Once you know how to use the app, getting the data to your students is the next step. My preference is to have a pre-made CODAP file available for upload to CODAP. You can upload a file directly from any computer or from a Google Drive, and I prefer to use Google Drive. I have taken the liberty of converting many of the data sets on this blog to CODAP files. I have tagged all of them with the <a href="http://found-data.blogspot.com/search/label/CODAP" target="_blank">CODAP label here</a> (also seen on the right side of the blog) and I have collected all the CODAP files in <a href="https://drive.google.com/open?id=1XktX4O57rFYZVC6_MA-n52j22xYReQpe" target="_blank">this folder</a>. Alternatively, you can upload your own data in a .csv file, though it does not seem like you can do this directly from a Google Drive. So I would stick to creating the CODAP files and sharing those with your students (either on Google Drive or a local network drive). Either way, if you use any of these files, I would download them from this blog and then upload them to your preferred place.<br />
<br />
At the risk of being redundant, here is a list of the past posts that I have converted. Future posts will also include CODAP versions.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilNdeXvR9T3IXkbUY9QiEZ5E6Rab3tnuDTOT22o3kH2EFLDSt-fmgDNd2NkxAfJgSph_f_7eBHtj-0PQi5PGhWaLHnyNluvR-k5LM3mnqr8sjCHyc4wUIt6-U9xD9_QLHs2aPwXsXdq_ag/s1600/Codap.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="836" data-original-width="1600" height="166" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilNdeXvR9T3IXkbUY9QiEZ5E6Rab3tnuDTOT22o3kH2EFLDSt-fmgDNd2NkxAfJgSph_f_7eBHtj-0PQi5PGhWaLHnyNluvR-k5LM3mnqr8sjCHyc4wUIt6-U9xD9_QLHs2aPwXsXdq_ag/s320/Codap.png" width="320" /></a></div>
<a href="http://found-data.blogspot.com/2015/11/anscombes-quartet.html" target="_blank">Anscombe's Quartet</a><br />
<a href="http://found-data.blogspot.com/2015/12/smoking-and-cancer.html" target="_blank">Smoking and Cancer</a><br />
<a href="http://found-data.blogspot.com/2015/12/movie-data.html" target="_blank">Movie Data</a><br />
<a href="http://found-data.blogspot.com/2015/12/how-much-would-you-pay-for-50-gift-card.html" target="_blank">How Much Would you Pay for a $50 Gift Card?</a><br />
<a href="http://found-data.blogspot.com/2016/01/earthquake-database.html" target="_blank">Earthquake Data</a><br />
<a href="http://found-data.blogspot.com/2016/01/trending-data.html" target="_blank">Trending Data</a><br />
<a href="http://found-data.blogspot.com/2016/01/magazines.html" target="_blank">Magazines</a><br />
<a href="http://found-data.blogspot.com/2016/03/speed-data.html" target="_blank">Speed Data</a><br />
<a href="http://found-data.blogspot.com/2016/06/electric-car-rebates.html" target="_blank">Electric Car Rebates</a><br />
<a href="http://found-data.blogspot.com/2016/07/is-levelling-up-in-pokemon-go.html" target="_blank">Is Levelling Up in Pokemon Go Exponential</a><br />
<a href="http://found-data.blogspot.com/2016/09/collecting-data-from-pokemon-go.html" target="_blank">Collecting Data from Pokemon Go</a><br />
<br />
Don't forget to look at the <a href="https://codap.concord.org/" target="_blank">CODAP site</a> for lots of great resources: more <a href="https://concord-consortium.github.io/codap-data/" target="_blank">data sets</a>, <a href="https://codap.concord.org/help/how-to" target="_blank">tutorials</a>, <a href="https://codap.concord.org/help/faq" target="_blank">FAQs</a> and, even though we haven't talked about them here, <a href="https://concord.org/wp-content/uploads/2016/12/codap/embed/dsg.html" target="_blank">simulations</a>. Or just look at the <a href="https://codap.concord.org/for-educators/" target="_blank">Educator Resources</a> page.<br />
<h3>
Download the Data</h3>
<a href="http://found-data.blogspot.com/search/label/CODAP" target="_blank">All the Posts</a><br />
Folder of <a href="https://drive.google.com/open?id=1XktX4O57rFYZVC6_MA-n52j22xYReQpe" target="_blank">CODAP files</a><br />
<br />
<br />David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-58708479408937633852017-06-09T09:30:00.002-07:002021-05-16T10:46:28.955-07:00Five Thirty Eight's Pile of Data<a href="https://espnfivethirtyeight.files.wordpress.com/2014/07/hickey-feature-classicrock-41.png?quality=90&strip=info&w=575&ssl=1" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="248" src="https://espnfivethirtyeight.files.wordpress.com/2014/07/hickey-feature-classicrock-41.png?quality=90&strip=info&w=575&ssl=1" width="320" /></a><br />
<b><span style="color: red;">UPDATE</span></b>: Now even more of their data is available and easier to get at, you guessed it, their data site: <a href="https://data.fivethirtyeight.com/" target="_blank">https://data.fivethirtyeight.com/</a><br />
<br />
I have always found it tough to find interesting data sets. Especially those that are not contrived. At Five Thirty Eight they are constantly looking at the world through data. Their primary posts tend to be about politics or sports but often they have posts on pop culture and other items. For example, recently they had a post titled "<a href="https://fivethirtyeight.com/features/why-classic-rock-isnt-what-it-used-to-be/" target="_blank">Why Classic Rock Isn't What it used to be</a>". In that post they analyzed over 37000 plays of classic rock songs spanning decades. And not only have they done the work, they've made all of the <a href="https://github.com/fivethirtyeight/data/tree/master/classic-rock" target="_blank">raw data available</a>. All 37673 pieces in a csv file.<br />
<h3>
Downloading the Data</h3>
So basically they have a <a href="https://github.com/fivethirtyeight/data" target="_blank">Github site</a> where they make much of the raw data available for many of their stories. They have a lot of data related stories and although most of them are not on this site there are almost 100 that are. So for example, you could look at the article about how <a href="http://fivethirtyeight.com/features/avengers-death-comics-age-of-ultron" target="_blank">deadly it is to be an Avenger</a> and see that the article doesn't have any graphs but there is a <a href="https://github.com/fivethirtyeight/data/tree/master/avengers" target="_blank">bunch of data</a> where you could do a histogram or something with the categorical data.<br />
<br />
Or if you are a <a href="https://www.youtube.com/watch?v=oh5p5f5_-7A" target="_blank">Bob Ross</a> fan (real or ironic) then you can get the data they analyzed on the paintings he created for his show. Here's <a href="https://fivethirtyeight.com/features/a-statistical-analysis-of-the-work-of-bob-ross/" target="_blank">the article</a>, but on the <a href="https://github.com/fivethirtyeight/data/tree/master/bob-ross" target="_blank">GitHub site</a> you get the raw data plus, as an added bonus for you code jockeys, the Python script that they used to create the data set. Most of the data sets link back to the original article.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7Z2a53P21_6HK6xFUE1BGtysWMBCcmgu1XV7jAxn9pVnrOmzsdENl3zUphjpq0QnewK8juYC54_dCmhumZUWfFPfn4lbxSFh8MCRzSZ2H08h4ZLmnxkerxVGSP4IIko3iJ6BcUntlknez/s1600/Bob+Ross.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7Z2a53P21_6HK6xFUE1BGtysWMBCcmgu1XV7jAxn9pVnrOmzsdENl3zUphjpq0QnewK8juYC54_dCmhumZUWfFPfn4lbxSFh8MCRzSZ2H08h4ZLmnxkerxVGSP4IIko3iJ6BcUntlknez/s640/Bob+Ross.png" width="560" /></a></div>
<a href="https://espnfivethirtyeight.files.wordpress.com/2015/06/flowers-datalab-unisexnames-1.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="800" data-original-width="728" height="200" src="https://espnfivethirtyeight.files.wordpress.com/2015/06/flowers-datalab-unisexnames-1.png" width="181" /></a>Note that when you see the CSV file listed, you can't just right click and download the file. That will just get you the script used to get the data. To get the actual data, click the CSV link and then copy the data from the table that appears.<br />
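Once you have the data saved as a CSV file, a few lines of Python are enough for a first look at the categorical columns. This is a rough sketch only: the column names and rows below are invented stand-ins (not the actual Five Thirty Eight columns), and the text is read from a string so the example runs without any download.<br />

```python
import csv
import io
from collections import Counter

# Stand-in for a downloaded file; the column names and rows are invented.
sample = """name,gender,appearances
Hero A,MALE,120
Hero B,FEMALE,85
Hero C,MALE,40
"""

# DictReader turns each row into a dictionary keyed by the header row.
rows = list(csv.DictReader(io.StringIO(sample)))

# Tally a categorical column and total a numerical one.
gender_counts = Counter(row["gender"] for row in rows)
total_appearances = sum(int(row["appearances"]) for row in rows)

print(gender_counts)
print(total_appearances)
```

With a real file you would replace <code>io.StringIO(sample)</code> with <code>open("your-file.csv", newline="")</code>, where the file name is whatever you saved the data as.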
<br />
Some other interesting sets are on <a href="https://github.com/fivethirtyeight/data/tree/master/fandango" target="_blank">Fandango's movie ratings</a>, or the connections between the actors in the movie <a href="https://github.com/fivethirtyeight/data/tree/master/love-actually" target="_blank">Love Actually</a> or their data on the popularity of <a href="https://github.com/fivethirtyeight/data/tree/master/unisex-names" target="_blank">unisex names</a>.<br />
<br />
One small warning. This is raw data and in a few cases really raw. For example, the data set about the number of times someone cursed or bled out in a Quentin Tarantino movie is very cool but totally inappropriate for a classroom (there are 1895 pieces of data in this set).<br />
<br />
Check them all out on the sites:<br />
<a href="https://data.fivethirtyeight.com/" target="_blank">https://data.fivethirtyeight.com/</a><br />
<a href="https://github.com/fivethirtyeight/data" target="_blank">https://github.com/fivethirtyeight/data</a><br />
<br />David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-6239630337359545302016-09-17T22:12:00.001-07:002018-08-26T17:43:58.074-07:00Collecting Data from Pokemon Go<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNeKEjcknpRzXae0VKQXooZZGedsOT0c5tu6exkTbbJYKl6wzBFJ9OTGkfbeYpf94cnQkTmC_i2YhQQwLXWdrf3Ow-kAQMJS2Ma0ceZuCHAryNPaXPVEjiWG32hdMuklNjzcMrqGHcC6f6/s1600/PokemonGoData.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNeKEjcknpRzXae0VKQXooZZGedsOT0c5tu6exkTbbJYKl6wzBFJ9OTGkfbeYpf94cnQkTmC_i2YhQQwLXWdrf3Ow-kAQMJS2Ma0ceZuCHAryNPaXPVEjiWG32hdMuklNjzcMrqGHcC6f6/s320/PokemonGoData.png" width="235" /></a>It's the beginning of the school year now and the dust is starting to settle from the summer's obsession with Pokemon Go. So why not try to leverage that obsession by having students collect some data. The data comes in the form of how many times each Pokemon was seen and caught by each user. I got the idea for this set of data from <a href="https://lesliefarooq.wordpress.com/2016/08/10/pokemon-go-math/" target="_blank">this post</a> from <a href="https://twitter.com/lesliefarooq" target="_blank">@lesliefarooq</a> where she pointed out that with each Pokemon caught, when you look in the Pokedex, there is data about how many times each Pokemon was both seen and caught. At first glance this is a simple data set but it turns out there is a lot you could do with it.<br />
<br />
So what I was able to do was start to collect some of that data by using a Google Form to generate two types of graphs. The first was a graph of the most often seen Pokemon (no surprise to players what the top three were). The second graph was the linear relationship between the number of caught and the number seen. What follows are the ways that you can either use my data or collect your own with your students.<br />
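The two summaries described above can also be sketched in Python once the form responses are exported. The responses here are invented for illustration (they are not my collected data), and each one is assumed to be a simple (Pokemon number, seen, caught) triple.<br />

```python
from collections import defaultdict

# Hypothetical form responses: (Pokemon number, times seen, times caught).
# Invented for illustration; several students may report the same Pokemon.
responses = [(16, 40, 30), (19, 35, 28), (16, 20, 15), (41, 10, 4)]

seen = defaultdict(int)
caught = defaultdict(int)
for number, times_seen, times_caught in responses:
    seen[number] += times_seen
    caught[number] += times_caught

# Pokemon numbers ordered from most seen to least seen.
most_seen = sorted(seen, key=seen.get, reverse=True)

# Fraction of sightings that ended in a catch, per Pokemon.
catch_rate = {number: caught[number] / seen[number] for number in seen}

print(most_seen)
print(catch_rate)
```

If the relationship between caught and seen really is linear through the origin, the catch rates should come out roughly equal across Pokemon, which is something students can check against the scatter plot.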
<h2>
Analysis</h2>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYGHQEQqBMXkLASwdkYtluDMk75fDTpF5s51hEgvIX0V9SSHEmXTyWVqyp1iPoxRmfRnKok0dd8mKI8NsCQ6-4-wkhdEbMZyZg7RzRsNcyTfl_9QOe3dn1DiTdnVbV-TCYcMO7njcfEjcY/s1600/PokemonInsructions.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="189" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYGHQEQqBMXkLASwdkYtluDMk75fDTpF5s51hEgvIX0V9SSHEmXTyWVqyp1iPoxRmfRnKok0dd8mKI8NsCQ6-4-wkhdEbMZyZg7RzRsNcyTfl_9QOe3dn1DiTdnVbV-TCYcMO7njcfEjcY/s320/PokemonInsructions.png" width="320" /></a></div>
The first thing you need to do is get the data. Once in the game, tap on the Pokeball at the bottom of the screen, then the Pokedex, and then tap on any Pokemon that shows up. On that Pokemon's screen you can collect its number, its name (optional; to make entry into the form quicker, I only required the number), how many were seen, how many were caught and finally the type of Pokemon. Swiping left or right cycles through the Pokemon so you can collect the data faster. So if you have students who have been playing the game, they can collect the data there. They can record it manually, use <a href="https://docs.google.com/forms/d/e/1FAIpQLSfYWByu7hA1HS9TyGoeZYgd1fI6H6qZPaN_vka_ySrByaEIGQ/viewform" target="_blank">this form</a> to add to my data electronically, or you can make a copy of <a href="https://docs.google.com/forms/d/1fgu1g280snjCUpndO0WXOf0pQNgfwuKUgs_rx-6Cf28/copy?usp=sharing" target="_blank">this form</a> to create your own class set.<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0DJQxOrB4M1AWj34oTw-D7ZiGcWt8FScjbIuxQNpZe0ETAzyopu1LrNhOmaXtNlXCOULCmA3f7EFyX-ljvJEPDj46m5oZs8JtCWLsmNv0Iy4Bk3hmObcATkOo4NNT4aI2BLUSJe2twjp3/s1600/PokemonGo-1.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="232" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0DJQxOrB4M1AWj34oTw-D7ZiGcWt8FScjbIuxQNpZe0ETAzyopu1LrNhOmaXtNlXCOULCmA3f7EFyX-ljvJEPDj46m5oZs8JtCWLsmNv0Iy4Bk3hmObcATkOo4NNT4aI2BLUSJe2twjp3/s320/PokemonGo-1.png" width="320" /></a><br />
Once you have the data, the first thing that you can have students do is create a bar graph of their most popular Pokemon like @lesliefarooq did. I took that a step further. Since I collected the data via a Google Form, I used a bit of spreadsheet wizardry to tally up the total number of Pokemon of each type seen across all the data. You can see that <a href="https://docs.google.com/spreadsheets/d/1oZwN85z99IXLFhLCkBXNJhPvcmOcvgpyMZMMTkhcsn8/edit?usp=sharing" target="_blank">in my data</a> sheet, where I have added some columns to the right of where the data is collected. The nice thing about this is that as more people add their data to my form, the totals will continue to update. So with this data you can do some of the same things that @lesliefarooq did: ask students about their most popular Pokemon and compare <a href="http://www.pokemongodb.net/2016/05/pokemon-go-pokedex.html" target="_blank">to the graphic</a> that shows how popular or rare each Pokemon is.<br />
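For anyone who would rather do the tally outside a spreadsheet, here is a minimal Python sketch of the same idea. The rows and the column order are made up for illustration (they are not taken from my sheet), and the function name is my own.

```python
from collections import defaultdict

# Hypothetical form responses: (pokemon_number, seen, caught, pokemon_type).
# These rows are invented for illustration; they are not from the shared sheet.
responses = [
    (16, 120, 95, "Normal"),
    (19, 80, 60, "Normal"),
    (41, 45, 30, "Poison"),
    (16, 60, 50, "Normal"),
    (129, 20, 18, "Water"),
]


def total_seen_by_type(rows):
    """Add up the 'seen' counts for each Pokemon type across all responses."""
    totals = defaultdict(int)
    for _number, seen, _caught, pokemon_type in rows:
        totals[pokemon_type] += seen
    return dict(totals)


totals = total_seen_by_type(responses)
# Sort from most seen to least seen, ready for a bar graph.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

The same loop works for the "caught" column or for tallying by Pokemon number instead of type.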
<br />
But the nice thing about this data is that you can now use the connection between the sightings and catches to connect to linear relationships. It's not a perfectly linear relationship but it will have a very strong correlation.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy0mPewFSJp4X00jRvTHOrzSaVTjNU3LlJSrARkwVXbhaahiW33ECABXWwa-azkcgoxXZVLaY_bsFTfSa8E8u9KHJT8u2VICUmWykjPiwpTLvbLHwvHR6wp6GGthuwrJSogNIm3iI27SEj/s1600/PokemonGo-2.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy0mPewFSJp4X00jRvTHOrzSaVTjNU3LlJSrARkwVXbhaahiW33ECABXWwa-azkcgoxXZVLaY_bsFTfSa8E8u9KHJT8u2VICUmWykjPiwpTLvbLHwvHR6wp6GGthuwrJSogNIm3iI27SEj/s320/PokemonGo-2.png" width="320" /></a>NOTE: In the actual game, players collect Pokemon in two ways. The main way is by having them appear and then catching them by throwing Pokeballs at them. Most Pokemon will be caught this way. The second way is to hatch eggs, and the only way to hatch an egg is to physically walk 2km, 5km or 10km (that is one of the physical activities that the game promotes). When you hatch an egg, it is often a rarer Pokemon that you will never see "in the wild". So these will always be seen once and caught once. This means that if you do any linear regression, you will have a large number of data points at (1, 1), and that will skew your regression, making the relationship look stronger than it is. So I suggest removing those data points. In the set that I give as a sample, I have already done that (see below).<br />
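To make the clean-up concrete, here is a small Python sketch that drops the (1, 1) points and then fits a least-squares line. The (seen, caught) pairs are invented for illustration and the function names are my own. Note that this filter also removes any Pokemon that really was seen once in the wild and caught once, which is a trade-off of the simple rule.

```python
from math import sqrt

# Hypothetical (seen, caught) pairs, invented for illustration.
pairs = [(1, 1), (1, 1), (1, 1), (10, 7), (25, 20), (40, 28), (60, 49), (90, 70)]


def drop_hatched(points):
    """Remove every (1, 1) point, the signature of a Pokemon that was only hatched."""
    return [p for p in points if p != (1, 1)]


def fit_line(points):
    """Return (slope, intercept, r) for a least-squares line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


cleaned = drop_hatched(pairs)
print(fit_line(pairs))    # with the (1, 1) points
print(fit_line(cleaned))  # without them
```

Students can compare the two printed results to see how much the repeated (1, 1) points change the slope and the correlation.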
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiG2VqYSO90KZW-bDH1FIjoxdO1SIMeOhpSBZ-0L6Yq7rCA47lCnHXSIkJR_wpFLtMKo_vjKM8XkJSZXg30YWd3GiIlmvioqaD1R_fnvgA1aDdBBbrjEd3DI7AsoK4OwlehXGhInEu4kdLa/s1600/PokemonGoDesmos.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="168" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiG2VqYSO90KZW-bDH1FIjoxdO1SIMeOhpSBZ-0L6Yq7rCA47lCnHXSIkJR_wpFLtMKo_vjKM8XkJSZXg30YWd3GiIlmvioqaD1R_fnvgA1aDdBBbrjEd3DI7AsoK4OwlehXGhInEu4kdLa/s320/PokemonGoDesmos.png" width="320" /></a>So this data set will be good for introductory linear relations with interpolation and extrapolation but what I have also done is extract some of the data into smaller sets. Because when we collected the data we also asked about the Pokemon number and Pokemon type. So this means we can start to use that info. For example, we can break up the big set into smaller sets, each corresponding to a different Pokemon. To facilitate that, I have created both a Fathom file and a <a href="https://teacher.desmos.com/activitybuilder/custom/57d014016070d2050610a577" target="_blank">Desmos Activity</a> with these smaller sets (try it out <a href="https://student.desmos.com/?prepopulateCode=qy5f" target="_blank">here</a>). The Desmos file, as it is set up, would be good for beginners when it comes to interpolation and extrapolation but it could be augmented for further exploration of lines of best fit. The Fathom file would be good for comparison of lines of best fit for the data sets. In the original data set you can also do things comparing the types of Pokemon as well.<br />
<h2>
Sample Questions</h2>
<ul>
<li>How does your top 20 most popular Pokemon compare to the top 20 of the larger set?</li>
<li>How does the number of each type of Pokemon compare to each other?</li>
<li>Which Pokemon has the highest average number of catches?</li>
<li>Which Pokemon is easier to catch, based on the data?</li>
<li>How does the linearity of the data relate to how easily the Pokemon can be caught?</li>
<li>Which type of Pokemon is easier to catch? Which one has the strongest correlation?</li>
</ul>
<h2>
Download the Data</h2>
<ul>
<li><a href="https://docs.google.com/forms/d/e/1FAIpQLSfYWByu7hA1HS9TyGoeZYgd1fI6H6qZPaN_vka_ySrByaEIGQ/viewform" target="_blank">Form</a> to add to this data set. <a href="https://docs.google.com/spreadsheets/d/1oZwN85z99IXLFhLCkBXNJhPvcmOcvgpyMZMMTkhcsn8/edit?usp=sharing" target="_blank">Google sheet</a> with the large data set, comparison of the Pokemon caught (on the second tab)</li>
<li><a href="https://docs.google.com/forms/d/1fgu1g280snjCUpndO0WXOf0pQNgfwuKUgs_rx-6Cf28/copy?usp=sharing" target="_blank">Form</a> to create your own data set (make a copy)</li>
<li>Fathom File with <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GRXRUaWloOFdkMlE" target="_blank">large data set</a> and with a <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GTWtmYXRxMV9PNFk" target="_blank">few graphs</a></li>
<li>CODAP <a href="https://drive.google.com/open?id=1-iqK7mrHEtspAkdH673wrqycrVPflB5U" target="_blank">large data set</a></li>
<li><a href="https://teacher.desmos.com/activitybuilder/custom/57d014016070d2050610a577" target="_blank">Desmos Activity</a></li>
<li>Original <a href="https://lesliefarooq.wordpress.com/2016/08/10/pokemon-go-math/" target="_blank">post</a></li>
<li>Pokemon popularity <a href="http://www.forbes.com/sites/davidthier/2016/07/18/the-rarest-pokemon-you-can-find-in-pokemon-go/#7c35a22e3ffd" target="_blank">data</a></li>
</ul>
<div class="post-body entry-content" id="post-body-5765260509595798464" itemprop="description articleBody" style="-webkit-text-stroke-width: 0px; background-color: white; color: black; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: normal; letter-spacing: normal; line-height: 1.4; orphans: 2; position: relative; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; width: 586px; word-spacing: 0px;">
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 14.56px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span><br />
<div style="clear: both;">
</div>
</div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-86263074313314252582016-07-27T20:47:00.000-07:002018-08-26T17:14:09.986-07:00Is Levelling Up in Pokemon Go Exponential?<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZxMyOhD60XLv17oA46vS82tlRVkxENKk7lz6nf8gXMIMA2wuJ09qCKCHZpn_ICqm1ayCqxAnDqriYHc543p0fzHzjkFELL0gytBsORatLPuzd-bboptEuDpmQw5dJXuVnVqAn8sjbqnog/s1600/PokemonGo.PNG" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZxMyOhD60XLv17oA46vS82tlRVkxENKk7lz6nf8gXMIMA2wuJ09qCKCHZpn_ICqm1ayCqxAnDqriYHc543p0fzHzjkFELL0gytBsORatLPuzd-bboptEuDpmQw5dJXuVnVqAn8sjbqnog/s200/PokemonGo.PNG" width="112" /></a>Unless you have been living under a rock over the last few weeks, you've probably heard of Pokemon Go. If you are not aware, the general premise is that you wander your neighbourhood (physically) with the app open. The app is linked to GPS and Google Maps, so as you walk around you see your streets, but overlaid on those streets are various Pokemon characters to capture. Along the way you collect points by visiting PokeStops (to also collect items) and PokeGyms (to also have battles), and you "Level Up" by accumulating experience (XP) points. As you increase your level, the number of points needed to go to the next level also increases. But how? Is it linear, quadratic, exponential or something else? Well, get the data and have your students decide.<br />
<h2>
Analysis</h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh79Lx5W9lwJmawINU2W1TeC3Uq6BHC_JHom1PUEkbxAZENagOX2Qa0fn4Z2Drw050foy05EUCehouBoItBbhm0JPKN7YE6YKyrulepZFc2_BokErDHKsGZ-H5r72da8R9e5eDE9wtxYee-/s1600/PokemonData.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh79Lx5W9lwJmawINU2W1TeC3Uq6BHC_JHom1PUEkbxAZENagOX2Qa0fn4Z2Drw050foy05EUCehouBoItBbhm0JPKN7YE6YKyrulepZFc2_BokErDHKsGZ-H5r72da8R9e5eDE9wtxYee-/s320/PokemonData.png" width="208" /></a>As far as anyone knows (right now) there are only 40 levels. To move past the 1st level you need to accumulate 1000 pts but by <a href="http://heavy.com/games/2016/07/pokemon-go-xp-gained-per-level-how-much-needed-need-to-up-advance-chart-graph-image/" target="_blank">level 40 you need five million</a>. So the question might be "How does the number of points change as you go from level to level?".<br />
<br />
As players are in the game, they will level up. What they will see is the number of points needed to get to the next level (not the total number of points accumulated). The first 15 levels can be seen to the right. The middle column shows the total number of points at the beginning of each level (constructed from the points needed to level up for each level). The right most column indicates how many points are needed in each level to get to the next level (this is what players would actually see). It is essentially the 1st difference of the total points. But to clarify, players never see the Total number of XP in the game. It was just constructed here because that is usually what we would be graphing. So to keep your street cred with the kids, you may want to only refer to the XP needed at each level and construct the total (like I did) for mathematical purposes.<br />
<br />
Regardless, this is one of the first places you can have students do some analysis. By looking at the points needed to level up you can see that, as you go from level to level, the number of points needed goes up by 1000 pts per level until level 11, where it stabilizes for a few levels.<br />
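Here is a short Python sketch of how the total column can be built from what players actually see. It only covers the first ten levels, using the 1000-per-level pattern described above, and it assumes the total is 0 at the start of level 1. The variable names are my own.

```python
from itertools import accumulate

# XP needed to leave levels 1 to 10, following the pattern described above:
# 1000 for level 1, then 1000 more for each level after that.
xp_needed = [1000 * level for level in range(1, 11)]

# Total XP at the start of each level: 0 at level 1, then a running sum.
total_xp = [0] + list(accumulate(xp_needed))

# The first differences of the totals give back what players see in the game.
first_differences = [b - a for a, b in zip(total_xp, total_xp[1:])]

for level, total in enumerate(total_xp, start=1):
    print(level, total)
```

Going the other way (from totals back to XP needed) is just the first difference, which is why the two columns carry the same information.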
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEga6ErhynHhP2EO_hJTxv2j6PbDop0raXL-XS8MCLdqN-x-Xa99vIyna4pHZI4Opu9P0uzXrH2YcG6vSwm-8OZZTogqfNZ0B4Ld_Fkrceh3BBhJiyVVoHOWrEw2DafqJMZhHIiQxe3owTY0/s1600/PokemonGo-All.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="95" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEga6ErhynHhP2EO_hJTxv2j6PbDop0raXL-XS8MCLdqN-x-Xa99vIyna4pHZI4Opu9P0uzXrH2YcG6vSwm-8OZZTogqfNZ0B4Ld_Fkrceh3BBhJiyVVoHOWrEw2DafqJMZhHIiQxe3owTY0/s200/PokemonGo-All.png" width="200" /></a>As you look at all the levels, there are a couple of ways to view the data. By plotting all 40 levels you can see that an <a href="https://www.desmos.com/calculator/jdtxly0lu5" target="_blank">exponential model</a> is almost a perfect fit, with the points growing by a little more than 25% each time you level up (a roughly, though not exactly, geometric progression). A different view comes from putting the levels in groups of 5. Doing this shows that as you go up levels you need significantly more XP to get to the next group of levels.<br />
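If you want students to see where a growth rate like "about 25% per level" comes from, one simple approach is to fit a straight line to the logarithm of the values. The Python sketch below does that. It is not necessarily how Desmos computes its regression, and the sample values are made up to grow by exactly 25% so the result is easy to check; swap in the real data from the sheet to try it for yourself.

```python
from math import exp, log


def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on (x, log y); return (a, b)."""
    n = len(xs)
    logs = [log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    slope = sxl / sxx
    intercept = mean_log - slope * mean_x
    return exp(intercept), exp(slope)


# Made-up values that grow by exactly 25% per level, for illustration only.
levels = list(range(1, 11))
points = [1000 * 1.25 ** (level - 1) for level in levels]

a, b = fit_exponential(levels, points)
print(round(a, 2), round(b, 4))
```

The second number returned, b, is the growth factor per level, so b = 1.25 means a 25% increase each time.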
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_YWb4o9BH7VzKcntctDnwS-XHEyrhDcvdki4IiH215T245eELpCsYQLq1fQBvZxLqM3VJLj57my_28Jx8OJgaKCYY1uG_4ouQDE5zUas2Rw0VsGo9uz76tfgTeSOE3wJcFBTnx00gn_EJ/s1600/PokemonGoGraph.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_YWb4o9BH7VzKcntctDnwS-XHEyrhDcvdki4IiH215T245eELpCsYQLq1fQBvZxLqM3VJLj57my_28Jx8OJgaKCYY1uG_4ouQDE5zUas2Rw0VsGo9uz76tfgTeSOE3wJcFBTnx00gn_EJ/s640/PokemonGoGraph.png" width="560" /></a></div>
<br />
<div style="text-align: center;">
</div>
<div style="text-align: left;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyHlC84x4xwxIHzYhsKvDwHKstmisHjhZUcIvVOmBwwVTI2v76AihGjE314Byi3sE78S_TJDsJxaVEyxbpbYpAq5zk8KzZk2KcMr9aocmNBmPq6boCPoIhkxd7WQf9cWH0mvA-lL4CwwKp/s1600/PokemonGo-First14.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="95" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyHlC84x4xwxIHzYhsKvDwHKstmisHjhZUcIvVOmBwwVTI2v76AihGjE314Byi3sE78S_TJDsJxaVEyxbpbYpAq5zk8KzZk2KcMr9aocmNBmPq6boCPoIhkxd7WQf9cWH0mvA-lL4CwwKp/s200/PokemonGo-First14.png" width="200" /></a>But a closer look at the data shows that the first 11 levels have a constant 2nd difference and thus are quadratic. And then the next few levels have <a href="https://www.desmos.com/calculator/vi657mzuqx" target="_blank">constant first differences</a> and thus go up linearly. After that the increases are not as consistent. </div>
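The difference test can be sketched in Python as well. The first list is the total XP for levels 1 to 11 built from the 1000-per-level pattern; the second list is an assumed stretch where every level costs the same 10000 XP, included only to show the linear case. The function names are my own.

```python
def differences(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def constant_difference_order(values, max_order=3):
    """Return the first order at which the differences are constant, or None."""
    current = list(values)
    for order in range(1, max_order + 1):
        current = differences(current)
        if len(current) < 2:
            return None  # too few values left to judge
        if all(d == current[0] for d in current):
            return order
    return None


# Total XP at the start of levels 1 to 11, built from 1000, 2000, ... per level.
early_totals = [500 * n * (n - 1) for n in range(1, 12)]
# Assumed totals for a stretch where every level costs the same 10000 XP.
flat_totals = [55000 + 10000 * k for k in range(4)]

print(constant_difference_order(early_totals))
print(constant_difference_order(flat_totals))
```

An order of 1 means the values are linear and an order of 2 means they are quadratic; None means neither pattern shows up within the orders checked.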
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
So there are many places in the curriculum that this data set can relate to. On the simple end you can look at it as a nonlinear data set. Or you can focus on just the first few levels and keep it quadratic, or contrast that with the linear portion. The fact that we are talking about discrete levels means that you can also think about this in terms of sequences and series. So take from it what you need. Below are some possible prompts you can use with students, and the entire set can be downloaded from <a href="https://docs.google.com/spreadsheets/d/1vDYAC7i8m8i0Csuglj1_yNTQzrYxcPZ0O3IfjNucGgg/edit?usp=sharing" target="_blank">this Google Doc</a> for easy consumption.</div>
<h2>
Sample Questions</h2>
<ul>
<li>If it took you one day to get to level 5, how long would it take you to get to level 10? Level 15? Level 40?</li>
<li>What type of relationship exists between the points for each level in the first 10 levels? 15 levels? all levels?</li>
<li>Do the levels follow a constant sequence?</li>
</ul>
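As a worked example for the first question, here is a Python sketch that assumes XP is earned at a constant daily rate (a big assumption, and worth debating with students). It uses only the first ten levels of the 1000-per-level pattern, so it can answer the level 10 part; levels 15 and 40 need the full table from the sheet. The function names are my own.

```python
def total_xp_to_reach(level, xp_needed):
    """Total XP to reach `level`, where xp_needed[i] is the XP to leave level i + 1."""
    return sum(xp_needed[: level - 1])


def days_to_reach(level, xp_needed, reference_level=5, reference_days=1.0):
    """Scale the time by total XP, assuming XP is earned at a constant daily rate."""
    rate = total_xp_to_reach(reference_level, xp_needed) / reference_days
    return total_xp_to_reach(level, xp_needed) / rate


# XP needed to leave levels 1 to 10: 1000, 2000, ..., 10000.
xp_needed = [1000 * level for level in range(1, 11)]

print(total_xp_to_reach(5, xp_needed))
print(total_xp_to_reach(10, xp_needed))
print(days_to_reach(10, xp_needed))
```

Reaching level 5 takes 10000 XP in total and reaching level 10 takes 45000 XP, so at the same daily rate level 10 would take 4.5 days.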
<h2>
Download the Data</h2>
<ul>
<li>Original <a href="http://heavy.com/games/2016/07/pokemon-go-xp-gained-per-level-how-much-needed-need-to-up-advance-chart-graph-image/" target="_blank">Data set</a> </li>
<li>Google Sheet <a href="https://docs.google.com/spreadsheets/d/1vDYAC7i8m8i0Csuglj1_yNTQzrYxcPZ0O3IfjNucGgg/edit?usp=sharing" target="_blank">version</a>, Google Sheet <a href="https://docs.google.com/spreadsheets/d/1IbViV9CyjA5szmjcPF9FUXjYIn585VWppaHJQUUkQcA/edit?usp=sharing" target="_blank">with graphs</a>, CODAP <a href="https://drive.google.com/open?id=1WpOS08a5miKro4RXPxebJiied8Z9HWh-" target="_blank">file</a></li>
<li>Desmos Exponential <a href="https://www.desmos.com/calculator/jdtxly0lu5" target="_blank">Regression</a></li>
<li>Desmos First <a href="https://www.desmos.com/calculator/vi657mzuqx" target="_blank">14 Levels</a></li>
</ul>
<div class="post-body entry-content" id="post-body-5765260509595798464" itemprop="description articleBody" style="-webkit-text-stroke-width: 0px; background-color: white; color: black; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 18.2px; orphans: auto; position: relative; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 1; width: 586px; word-spacing: 0px;">
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 14.56px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span><br />
<div style="clear: both;">
</div>
</div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-57652605095957984642016-06-06T11:57:00.001-07:002021-12-06T11:32:54.797-08:00Electric Car RebatesSo <a href="http://www.cbc.ca/news/canada/toronto/porsche-918-spyder-ontario-rebate-1.3551689" target="_blank">this article</a> came across my Facebook feed a while back and I thought it was a great potential source of data for discussion at many levels.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigXGNo5AwBLhCzoOLhqtUjxPgH6tbbS5gHWxHILKJGgvU2mV5sy82nXVDbgzQ2t9Oh8aauwu4TTUGWZUd7spZKsUKnRJgSH8wMj5xwfkFW_1faxRvhlp3K47z-rfEGAeaquZwgs84Z7ILJ/s1600/ElectricCars.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigXGNo5AwBLhCzoOLhqtUjxPgH6tbbS5gHWxHILKJGgvU2mV5sy82nXVDbgzQ2t9Oh8aauwu4TTUGWZUd7spZKsUKnRJgSH8wMj5xwfkFW_1faxRvhlp3K47z-rfEGAeaquZwgs84Z7ILJ/s640/ElectricCars.png" width="560" /></a></div>
It certainly captured my attention as an Ontario resident, and a closer look showed that there was potentially a lot of data to be analyzed. The data is about the <a href="http://www.mto.gov.on.ca/english/vehicles/electric/electric-vehicle-incentive-program.shtml" target="_blank">Ontario Electric Vehicle Incentive</a> program. The above article was inspired by <a href="https://news.ontario.ca/opo/en/2016/02/ontario-making-electric-vehicles-more-affordable.html" target="_blank">this news release</a>, but the article's authors were able to get more specific data about the number of vehicles of each style (which is not released).<br />
<h2>
Analysis</h2>
Students are encouraged to look critically at the original article and perhaps talk about how the title and some of the information given is used to incite a reaction.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggcWIG761W27w8OeAhUKGwBEWIUhn5oAhAZ4tmpZFG_-1JRSRkalMRKKVM9YyYNm_3sA79YRdoiIHunH0mQ72tEJC1gmwgA-3mNn1r3xFPqlARrQnVZcL0umzxqLyGDLPDfhA85LmsbM58/s1600/ElectricCars2.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggcWIG761W27w8OeAhUKGwBEWIUhn5oAhAZ4tmpZFG_-1JRSRkalMRKKVM9YyYNm_3sA79YRdoiIHunH0mQ72tEJC1gmwgA-3mNn1r3xFPqlARrQnVZcL0umzxqLyGDLPDfhA85LmsbM58/s640/ElectricCars2.png" width="560" /></a></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikjbdsSfg0qtgRb8KoYwTrLEbxpkGs-9RZlN9JLxAeeMdlc3N75abaXo8KSXbCoJZGG3WsaFPL8I6LG0P5hEXpJTju-sxb0_pm0p503_VGvG79hap8Jho9UmYrXoG2n2VcwJ4xbJK8ZJfW/s1600/ElectricCars3.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="223" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikjbdsSfg0qtgRb8KoYwTrLEbxpkGs-9RZlN9JLxAeeMdlc3N75abaXo8KSXbCoJZGG3WsaFPL8I6LG0P5hEXpJTju-sxb0_pm0p503_VGvG79hap8Jho9UmYrXoG2n2VcwJ4xbJK8ZJfW/s320/ElectricCars3.png" width="320" /></a>For example even though they gave the overall numbers of almost 4800 people getting around $39 million in rebates, they focused on just the rebates of the most expensive cars which total about 2% of the people and rebate value. And although they do mention it, it's not highlighted but about 25% of those rebates went to one vehicle, the Chevrolet Volt.<br />
But looking at the ministry website you can see a <a href="http://www.mto.gov.on.ca/english/vehicles/electric/electric-vehicle-rebate.shtml" target="_blank">nice data set about which cars</a> get which rebates (as well as info about how the program changed once it was pointed out that super expensive luxury cars were getting rebates).<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-U1dI7VAfVVlj5HVcL7IL7J2BGzGADKSTNE06HTL_S6DU1-gZXaD9qbzMrDYbXh9i09bQSQkxP8HaqSScYSzX6PfE_k39VdMl57z8MzEs_LqBD0pzGlna0DKPgvFxRZQpl0hhVOejgBNe/s1600/ElectricCars4.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="148" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-U1dI7VAfVVlj5HVcL7IL7J2BGzGADKSTNE06HTL_S6DU1-gZXaD9qbzMrDYbXh9i09bQSQkxP8HaqSScYSzX6PfE_k39VdMl57z8MzEs_LqBD0pzGlna0DKPgvFxRZQpl0hhVOejgBNe/s320/ElectricCars4.png" width="320" /></a>I was able to get this table out and clean it up as well as add the approximate value of each car to the list (it's approximate because I had to go and search each out on the web so I might have been a bit lazy when it came to options) and now it is good for some simple analysis.<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMar3SPw7cJsQWpx10LTcYpb2qzpwaj_3CF6JDdhwJI4g-KPq1y2D2fn23J9cQlrZsPoty-GBGwoQ-kliHDicgeUbo6dRJtKNDbAZckbxIXlwrwBsnFzSBgz0k670Jjzw2m558xgYlQpv1/s1600/ElectricCars5.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="205" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMar3SPw7cJsQWpx10LTcYpb2qzpwaj_3CF6JDdhwJI4g-KPq1y2D2fn23J9cQlrZsPoty-GBGwoQ-kliHDicgeUbo6dRJtKNDbAZckbxIXlwrwBsnFzSBgz0k670Jjzw2m558xgYlQpv1/s320/ElectricCars5.png" width="320" /></a>On the "low hanging fruit" end you can create the bar graph of the number of models for each company. Personally, I wouldn't have guessed GM to be at the top. But you can also create a histogram of the actual rebate to look at the distribution (or perhaps look at the box plot or dot plot). Lastly you could look at whether there is a connection with the price of the car and how big the rebate is.<br />
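These three analyses can also be sketched in a few lines of Python. The rows below are placeholders I made up (they are not from the ministry table), so the printed answers say nothing about the real cars; replace them with rows from the Google Sheet to get real results.

```python
from collections import Counter

# Hypothetical rows (manufacturer, model, price, rebate); not from the ministry table.
cars = [
    ("Maker A", "Model 1", 40000, 8500),
    ("Maker A", "Model 2", 45000, 8500),
    ("Maker B", "Model 3", 32000, 5000),
    ("Maker C", "Model 4", 120000, 3000),
    ("Maker A", "Model 5", 38000, 7000),
]

# Number of models per manufacturer (the bar graph) and how often each
# rebate value occurs (the most common rebate).
models_per_maker = Counter(maker for maker, _model, _price, _rebate in cars)
rebate_counts = Counter(rebate for _maker, _model, _price, rebate in cars)

# Rebate as a share of price, to compare which car benefits most.
rebate_share = {model: rebate / price for _maker, model, price, rebate in cars}
best_model = max(rebate_share, key=rebate_share.get)

print(models_per_maker.most_common(1))
print(rebate_counts.most_common(1))
print(best_model)
```

Looking at the rebate as a share of the price is only one way to define "benefits the most"; students may argue for the raw dollar amount instead.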
<h2>
Sample Questions</h2>
<ul>
<li>Which manufacturer has the most electric models?</li>
<li>What is the most common rebate value?</li>
<li>Does the rebate get bigger (in general) as the price of the car increases?</li>
<li>If you were going to purchase an electric vehicle, which one would benefit the most/least from the rebate program?</li>
</ul>
<h2>
Download the Data</h2>
<ul>
<li>Ontario Electric Cars (<a href="https://docs.google.com/spreadsheets/d/1Nsuc__dk_t8uVXW4z5q5bguD9OmDBxj93syoJi5T3og/edit?usp=sharing" target="_blank">Google Sheet</a>, <a href="https://drive.google.com/file/d/0B0-WU3yuHa7GTm5EdnlFYndydGs/view?usp=sharing&resourcekey=0-AVzM5I0LQkruAzlG-7d5OA" target="_blank">Fathom</a>, <a href="https://drive.google.com/file/d/1goch64xavVgkbvVNmYDSUW_jI3OMsAwc/view?usp=sharing" target="_blank">CODAP</a>)</li>
<li>Original <a href="http://www.cbc.ca/news/canada/toronto/porsche-918-spyder-ontario-rebate-1.3551689" target="_blank">Article</a></li>
<li>Original <a href="https://news.ontario.ca/opo/en/2016/02/ontario-making-electric-vehicles-more-affordable.html" target="_blank">News brief</a></li>
<li>Original <a href="http://www.mto.gov.on.ca/english/vehicles/electric/electric-vehicle-rebate.shtml" target="_blank">Table of rebates</a></li>
</ul>
<span face=""arial" , "tahoma" , "helvetica" , "freesans" , sans-serif" style="background-color: white; font-size: 13px; line-height: 14.56px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-1689834816141520312016-05-24T12:31:00.000-07:002016-05-24T12:31:07.356-07:00Gas Prices in OntarioA friend, Michael Lieff pointed this nice set of data out. It is the price of gas in several Ontario cities going as far back as 1990. This is an interesting data set as the price of gas, in general, increases but you can see that that wasn't always the case (only a few of the cities are shown below).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGfxUITMf1eRyJoooCbWatsECB5SuG_VoITkNzmQ-1esyRUkvpLVeoWtJGYHSvJ4LjuALQwzBh_DIV9r6yQ9vfz5TC0wb21CNLkd4CItf8rCo4OvXa8sc0wBb93TEM6odA3y9Nlple8Ej_/s1600/GasPrices1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGfxUITMf1eRyJoooCbWatsECB5SuG_VoITkNzmQ-1esyRUkvpLVeoWtJGYHSvJ4LjuALQwzBh_DIV9r6yQ9vfz5TC0wb21CNLkd4CItf8rCo4OvXa8sc0wBb93TEM6odA3y9Nlple8Ej_/s640/GasPrices1.png" width="550" /></a></div>
<h2>
Analysis</h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9CZAb-QD4qQKkF3fMPIRfx2ZVX7Q3eMTjwnb-oNLU5jVTOpP6K1tL_2LuKgoFdZV7tPWFf_1EWtkfPYLv3sgDaG07KX2Pf6SMnknNvBB3stpai1WzOz8cgGbKIs_c0EmfIEb2DsKbPO70/s1600/GasPrices2.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="124" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9CZAb-QD4qQKkF3fMPIRfx2ZVX7Q3eMTjwnb-oNLU5jVTOpP6K1tL_2LuKgoFdZV7tPWFf_1EWtkfPYLv3sgDaG07KX2Pf6SMnknNvBB3stpai1WzOz8cgGbKIs_c0EmfIEb2DsKbPO70/s320/GasPrices2.png" width="320" /></a>When you go to this website you have several options for prices and you can download a year of data at a time (with a CSV as an option). The obvious choice is regular gasoline but you might want to consider things like comparing regular gas to alternative fuels like propane. For example in this case, you can see that, in general, propane also has risen in price over time but where gasoline seems to fluctuate similarly regardless of the city, propane seems to be more volatile depending on location.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjl53wQWvisGznYnaoSfWLt2nm7kjgsaJ8sdLIB5YiUrdu7mGYkaa1NCNaUd-9zdtaGJZ-_b7j0QzUYDOK6jtE-I8XGNvA9HcKwRuqygeGfafIcxqAd4s2QN-aLx5pHvoqIObvQ6nqiC0Tu/s1600/GasPrices3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjl53wQWvisGznYnaoSfWLt2nm7kjgsaJ8sdLIB5YiUrdu7mGYkaa1NCNaUd-9zdtaGJZ-_b7j0QzUYDOK6jtE-I8XGNvA9HcKwRuqygeGfafIcxqAd4s2QN-aLx5pHvoqIObvQ6nqiC0Tu/s640/GasPrices3.png" width="560" /></a></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIHv3_qfZekxYkOe_fRu2wC94aAyASM8znqUjFmarH0qKRtv-IKGLUJDayt2sPrB3LIKkaBZGfDPTWDp3P_A6f6EM2Q1TLCdBZQvSdSFCLquJ1O8R7cDbG5z_9x2II0epd36UYr6IJu19U/s1600/GasPrices4.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="222" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIHv3_qfZekxYkOe_fRu2wC94aAyASM8znqUjFmarH0qKRtv-IKGLUJDayt2sPrB3LIKkaBZGfDPTWDp3P_A6f6EM2Q1TLCdBZQvSdSFCLquJ1O8R7cDbG5z_9x2II0epd36UYr6IJu19U/s320/GasPrices4.png" width="320" /></a><br />
Because of the sheer number of data points possible (you can get a weekly average for the last 25 years for several cities if you want), you may wish to stick to yearly values. Another option is to use some of the weekly values to talk about the dangers of extrapolation.<br />
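If you download the weekly values but want yearly ones, the reduction is a simple average per year. Here is a Python sketch; the dates and prices are invented for illustration (they are not values from the Ontario site), and the function name is my own.

```python
from collections import defaultdict
from datetime import date

# Hypothetical weekly prices in cents per litre; not values from the Ontario site.
weekly = [
    (date(2014, 1, 6), 125.0),
    (date(2014, 1, 13), 127.0),
    (date(2015, 1, 5), 95.0),
    (date(2015, 1, 12), 97.0),
    (date(2015, 1, 19), 99.0),
]


def yearly_average(rows):
    """Average the weekly prices within each calendar year."""
    by_year = defaultdict(list)
    for day, price in rows:
        by_year[day.year].append(price)
    return {year: sum(prices) / len(prices) for year, prices in sorted(by_year.items())}


averages = yearly_average(weekly)
print(averages)
```

The same grouping idea works for monthly averages by keying on (day.year, day.month) instead of day.year.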
<br />
<br />
<br />
<h2>
Download the Data</h2>
Site <a href="http://www.energy.gov.on.ca/en/fuel-prices/" target="_blank">http://www.energy.gov.on.ca/en/fuel-prices/</a><br />
I have also taken the liberty of downloading all of the data for gasoline (all 25 years of it) in weekly, monthly and yearly form, as well as the yearly propane data. You can get it on this <a href="https://drive.google.com/open?id=1RElhGPYwPXyGwgGV5Pxbfr1AbI9epxecIPou0CUX6pk" target="_blank">Google sheet</a> (note the tabs) or just the gas prices on <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GWERJQlJjbFo0X3c" target="_blank">Fathom</a>.<br />
<br />
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.2px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com3tag:blogger.com,1999:blog-2271061401959643709.post-68537522636071298812016-05-13T16:08:00.001-07:002021-05-16T10:46:54.510-07:00The Data and Story Library - DASL<a href="https://dasl.datadescription.com/wp-content/uploads/sites/6/2018/08/ddlib.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="262" data-original-width="800" height="130" src="https://dasl.datadescription.com/wp-content/uploads/sites/6/2018/08/ddlib.jpg" width="400" /></a>DASL (pronounced "dazzle"), the <a href="http://dasl.datadesk.com/" target="_blank">Data and Story Library</a>, is an awesome database of data sets that are specifically meant to help teach topics in statistics. They are all real sets and are all categorized by topic/subject (e.g. automotive, food, health, sports) and mathematical method (e.g. boxplots, mean, outliers, regression, scatterplots). So if you wanted to find a set of data to help teach a specific topic you could search for, say, "correlation".<br />
These are some great data sets to get through the mechanical nature of statistics. It's not very current data but it's great for practicing statistical methods.<br />
For the longest time this set of data was not available but just recently it was hosted by Data Description Inc. so now we have access to it again.<br />
<h2>
Analysis</h2>
There are far too many sets to discuss the analysis of each one, but when the site was down I <a href="http://found-data.blogspot.ca/search/label/smoking" target="_blank">blogged</a> about one of my favourite sets, on <a href="http://dasl.datadesk.com/story/view/24" target="_blank">Smoking and Cancer</a>. Take a look at that post to get a sense of the data. To see the actual data file for any data set, click on the Datafile Name.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsWfMQn7-36y80P0uCnfh88KQbgwCv2qDiG9vY6bJgGsjw3D8UZoWlFTSYVhRSSjREgGk_ixTZA6iBnvu1PIj0mJBz72SOE6YH3qbHA8FH5OJFUX-UO-Selk6nnLRBy9M5UcagKqHw2vA-/s1600/DASL1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsWfMQn7-36y80P0uCnfh88KQbgwCv2qDiG9vY6bJgGsjw3D8UZoWlFTSYVhRSSjREgGk_ixTZA6iBnvu1PIj0mJBz72SOE6YH3qbHA8FH5OJFUX-UO-Selk6nnLRBy9M5UcagKqHw2vA-/s640/DASL1.png" width="540" /></a></div>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh74cPzoD5az87-O4HEovQQi9QGK66WET7_A4FSwZ_ypbW-fkiA7F7sbSshcUDOj5lCiTB6-Ov37YPA6jXSg57XH-V99GuBQutIHjHbuLSo3D-FOr3DtDOzEQALA_OGT7zqTWzkniu4Yo-w/s1600/DASL4.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="223" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh74cPzoD5az87-O4HEovQQi9QGK66WET7_A4FSwZ_ypbW-fkiA7F7sbSshcUDOj5lCiTB6-Ov37YPA6jXSg57XH-V99GuBQutIHjHbuLSo3D-FOr3DtDOzEQALA_OGT7zqTWzkniu4Yo-w/s320/DASL4.png" width="320" /></a>This will show you the <a href="http://dasl.datadesk.com/data/download/26" target="_blank">text file</a> of the data with the download link at the top of the page.<br />
From that point you can do the analysis. Each data set has a detailed description of each variable, a short story and a sample analysis.<br />
There are many data sets on this site for every statistical topic and on a range of subjects. One thing you might have your students do is explore the site and find data sets that can be used to exemplify a particular statistical concept.<br />
<h2 style="clear: both; text-align: left;">
Download the Data</h2>
<div class="separator" style="clear: both; text-align: left;">
Site: <a href="http://dasl.datadesk.com/">http://dasl.datadesk.com/</a> </div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.2px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span><br />
<br />David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-77100558016061544442016-03-05T20:31:00.000-08:002018-08-26T17:44:48.754-07:00Speed DataA few weeks ago I saw this Tweet<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
Hey <a href="https://twitter.com/hashtag/MTBoS?src=hash">#MTBoS</a>, I need help finding some data. For a car accelerating from 0 mph, I want velocity as a function of time. Help? Link?</div>
— Michael Fenton (@mjfenton) <a href="https://twitter.com/mjfenton/status/695708972343406592">February 5, 2016</a></blockquote>
<script async="" charset="utf-8" src="//platform.twitter.com/widgets.js"></script>
I used to have some data kicking around my computer, but I did a quick Google search and found that Car & Driver was a huge source of this type of data. And I love that you can get some of the data with their original <a href="http://media.caranddriver.com/files/2015-porsche-918-spyder-feature-car-and-driver2015-porsche-918-spyder.pdf" target="_blank">handwritten data sheets</a>. BTW, here is @MJFenton's finished activity:<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
I just put together a new modeling lesson based on the Porsche 918 Spyder. Take it for a spin? <a href="https://t.co/iMiS7ZRaAs">https://t.co/iMiS7ZRaAs</a> <a href="https://twitter.com/hashtag/FeedbackWelcome?src=hash">#FeedbackWelcome</a></div>
— Michael Fenton (@mjfenton) <a href="https://twitter.com/mjfenton/status/697557687228243968">February 10, 2016</a></blockquote>
<script async="" charset="utf-8" src="//platform.twitter.com/widgets.js"></script>
And the <a href="https://teacher.desmos.com/activitybuilder/custom/56bba82b4b336120062011da" target="_blank">teacher version</a>.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-IMrR0qiSxwPfslu02M4XWlisyK1IxtFzS3_H-s9B8z_16mutZJkxquQrObFvXfuhNWT8rxLEYjqV89j8NxrvkeAQg1TNZmubeCSjy9R2dReYyX5llHZ29lbPL3rNQaTMmmPDUiylauOe/s1600/SpeedData-Desmos.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-IMrR0qiSxwPfslu02M4XWlisyK1IxtFzS3_H-s9B8z_16mutZJkxquQrObFvXfuhNWT8rxLEYjqV89j8NxrvkeAQg1TNZmubeCSjy9R2dReYyX5llHZ29lbPL3rNQaTMmmPDUiylauOe/s640/SpeedData-Desmos.png" width="550" /></a></div>
<h2>
The Analysis</h2>
Let's start with the data set from the above post. You can certainly do the Desmos Need for Speed activity. Determining a function for this data is a little intense (i.e. it is not a standard function model). You can see some of the more exact analysis via the two links in the tweet below.<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
<a href="https://twitter.com/davidpetro314">@davidpetro314</a> Here’s a decent fit: <a href="https://t.co/dLVznfpk1d">https://t.co/dLVznfpk1d</a> And <a href="https://twitter.com/squishythinking">@squishythinking</a> suggested this improvement: <a href="https://t.co/9ydyQdcZlf">https://t.co/9ydyQdcZlf</a></div>
— Michael Fenton (@mjfenton) <a href="https://twitter.com/mjfenton/status/698591828841332736">February 13, 2016</a></blockquote>
<script async="" charset="utf-8" src="//platform.twitter.com/widgets.js"></script>
But if you didn't want to go too deep, you could just use it to talk about nonlinear relationships, or you could use it to talk about rates of change, as speed data comes up a lot in calculus.<br />
I have also found more data sets from different cars and you can see how they compare to each other on this <a href="https://www.desmos.com/calculator/t6ltpsvaju" target="_blank">Desmos file</a>.<br />
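To make the rate of change idea concrete, here is a minimal Python sketch that estimates the average acceleration between consecutive readings. The times and speeds are made-up placeholder values (not numbers from any Car & Driver data sheet), and the function name is just my own choice for illustration.

```python
def average_rates_of_change(times, speeds):
    """Return the average rate of change of speed over each consecutive interval."""
    if len(times) != len(speeds):
        raise ValueError("times and speeds must have the same length")
    rates = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]          # elapsed time in seconds
        dv = speeds[i] - speeds[i - 1]        # change in speed in mph
        rates.append(dv / dt)                 # mph gained per second
    return rates


# Hypothetical readings: time in seconds, speed in mph
times = [0.0, 1.0, 2.0, 4.0]
speeds = [0.0, 30.0, 50.0, 70.0]

rates = average_rates_of_change(times, speeds)
print(rates)
```

If the rates shrink from one interval to the next, as they do for these placeholder values, the relationship is not linear, which is a nice lead-in to both of the discussions suggested above.<br />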
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoKyLQsvpMZNeCv9V6q97oJ_8HtuKOBAoD1ClHXx_wTHLgf9fEKFZHCfsJGrpDoh8l2amVlVDQsc5D_zb5n2aYxqNmlQ5Joz1T585vwH2VKSBbl-g-etm_j0suWRhVRnObOGlNADTucJCd/s1600/CarDriver.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoKyLQsvpMZNeCv9V6q97oJ_8HtuKOBAoD1ClHXx_wTHLgf9fEKFZHCfsJGrpDoh8l2amVlVDQsc5D_zb5n2aYxqNmlQ5Joz1T585vwH2VKSBbl-g-etm_j0suWRhVRnObOGlNADTucJCd/s320/CarDriver.png" width="249" /></a></div>
<h2>
Download the Data</h2>
There is actually a lot of data to be found on the Car & Driver site. Many of the cars in <a href="http://www.caranddriver.com/list-reviews-specialty-files" target="_blank">this link</a> have data sheets (you really have to search around on each page to find the data sheet). But I have downloaded a few of them (seen in the Desmos file above) and created a Google Sheet for each so you can copy and paste the data wherever you want.<br />
<a href="http://www.caranddriver.com/features/the-2015-porsche-918-spyder-is-the-quickest-road-car-in-the-world-feature-performance-data-and-complete-specs-page-2" target="_blank">Porsche Spyder</a> Data Sheet Google Sheet<br />
<a href="http://www.caranddriver.com/reviews/hotchkis-e-max-dodge-challenger-feature" target="_blank">Dodge Challenger</a> Data Sheet Google Sheet<br />
<a href="http://www.caranddriver.com/reviews/2012-slp-chevrolet-camaro-zl1-convertible-review" target="_blank">Chevy Camaro</a> Data Sheet Google Sheet<br />
<a href="http://www.caranddriver.com/reviews/2011-lingenfelter-cadillac-cts-v-road-test-review" target="_blank">Cadillac CTS</a> Data Sheet Google Sheet<br />
<a href="http://www.caranddriver.com/reviews/2016-chevrolet-malibu-lt-15-liter-test-review" target="_blank">Chevy Malibu</a> Data Sheet Google Sheet<br />
<a href="http://www.caranddriver.com/reviews/2007-hks-honda-fit-sport-turbo-specialty-file" target="_blank">Honda Fit</a> Data Sheet Google Sheet<br />
<a href="https://drive.google.com/open?id=0B0-WU3yuHa7GckJKTWp1OE8wT1k" target="_blank">All Google Sheets</a><br />
All data in CODAP <a href="https://drive.google.com/open?id=13oOhaQqOf7Hj1i9_Y6i3kpdUnQFDuf0R" target="_blank">file</a> (with <a href="https://drive.google.com/open?id=1gd8MJkJeJVFMF2OpgMeXPTjClbYGhuTR" target="_blank">graph</a>)<br />
<br />
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 14.56px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-79324911428651650042016-01-26T20:46:00.008-08:002021-05-08T14:07:55.179-07:00Magazines <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMpKRlGTcqERp3cLE76i9kj6OKnzm-1zauvrFK4x6PexytOQaNhoSBmQpH1xzCjCJB0AnmQo7kD1F-8AIl96YIgaHg4kJ85Ud5LIEPqyj4Irp3KHLGELkiTe4lVQ9UZpcBVailfU788aqc/s1600/Magazine3.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMpKRlGTcqERp3cLE76i9kj6OKnzm-1zauvrFK4x6PexytOQaNhoSBmQpH1xzCjCJB0AnmQo7kD1F-8AIl96YIgaHg4kJ85Ud5LIEPqyj4Irp3KHLGELkiTe4lVQ9UZpcBVailfU788aqc/s400/Magazine3.png" width="400" /></a>A while back I started doing this activity with my students on the first day. For homework I would tell them to go home and find two magazines, record their prices and total number of pages, and count the number of pages with ads on them. Once they brought that in, we would combine all the data into one set. I got the idea from browsing through an Oprah magazine and being shocked at how many pages I had to turn in order to get to a page that had actual content on it. Eventually I automated the process by using a <a href="https://docs.google.com/forms/d/1Dr6fgnRA8Gw5srEFi9vJEW3jxDo1kIV862mgpB0-Plw/viewform" target="_blank">Google Form</a> to collect the data. And by adding another criterion (the type of magazine), this actually turns into a pretty rich data set.<br />
<h2>
The Analysis</h2>
Certainly with this data set you can do any number of calculations (average, standard deviation, correlation etc.), but I liked to use it to create a need to move from single-variable analysis to two-variable analysis. For example, the magazine in the current set with the highest number of ad pages is In Style, with 380 ad pages (which is definitely an outlier).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgq4tpFYIRwY4Q6diPXFkYANXp3_bxmIfhX9AvIMmTx6-OR9XCB4gWT_cIVDtoVATUfs9_9fUCDjUrBnUpITN8WkGReaFxQPtPK7UL4pHu2qYNPiXsjz2nclAwbYW0Me9X83yBlFojwHR6c/s1600/Magazine1.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgq4tpFYIRwY4Q6diPXFkYANXp3_bxmIfhX9AvIMmTx6-OR9XCB4gWT_cIVDtoVATUfs9_9fUCDjUrBnUpITN8WkGReaFxQPtPK7UL4pHu2qYNPiXsjz2nclAwbYW0Me9X83yBlFojwHR6c/s320/Magazine1.png" width="550" /></a></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxrG1KQnuQ9hZ1xhL02NqH2EnxSC1rsGk83CKHXmSqcTVFX3lkdewJ9-fbNUbGHRA7a6T2hCwwfzB46ek5sRb4fR26Bv-HZ125m_b9X7uUtr9kKeczBba2U0AXxeGcq6YcE45xbtSpUKVe/s1600/Magazine2.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxrG1KQnuQ9hZ1xhL02NqH2EnxSC1rsGk83CKHXmSqcTVFX3lkdewJ9-fbNUbGHRA7a6T2hCwwfzB46ek5sRb4fR26Bv-HZ125m_b9X7uUtr9kKeczBba2U0AXxeGcq6YcE45xbtSpUKVe/s320/Magazine2.png" width="320" /></a>This seems outrageous, and the hope is that it will intrigue the students into asking questions. Perhaps they will also realize that it's the magazine with the largest number of total pages. That then presents a need to do a different type of analysis (a two-variable scatter plot). And when you do that analysis you will see that although 380 ad pages is proportionally a little high for a magazine with 620 total pages, it is not so outrageous.<br />
This is a good data set for just looking at the basic stuff (creating bar graphs, histograms, box plots and scatterplots, measuring central tendency, determining correlations, finding least squares lines etc.).<br />
Another thing you can do is look at the breakdown of magazine popularity (in your class or with this data set) by type of magazine. Breaking the data up by type of magazine gives students an opportunity to compare graphs. When students compare graphs, an important skill to have them demonstrate is making sure the sizes and scales of the graphs are similar. This data set can help facilitate that.<br />
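As a small worked example of thinking proportionally rather than in raw counts, here is a Python sketch that turns the In Style numbers quoted above (380 ad pages out of 620 total) into a proportion. The function name is only illustrative.

```python
def ad_proportion(ad_pages, total_pages):
    """Return the fraction of a magazine's pages that carry ads."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return ad_pages / total_pages


# Numbers quoted in the post for In Style
in_style = ad_proportion(380, 620)
print(f"In Style ad proportion: {in_style:.1%}")
```

Students could compute the same proportion for every magazine in the set and then see where In Style falls, instead of judging it by the raw count of 380 alone.<br />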
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJkj6phXApH2GoBVqxBPUEgsyFBaM-CeMJZQLjSRJ7PuOS388gARPlliOB7eGJQHp_RzIX-bOYfMHywNwnEKeP1TF2YP50apoCFXn1a8JW1htRmqaei8856AgpVBH_rzow53GeNPJWzO1U/s1600/Magazine4.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJkj6phXApH2GoBVqxBPUEgsyFBaM-CeMJZQLjSRJ7PuOS388gARPlliOB7eGJQHp_RzIX-bOYfMHywNwnEKeP1TF2YP50apoCFXn1a8JW1htRmqaei8856AgpVBH_rzow53GeNPJWzO1U/s640/Magazine4.png" width="580" /></a></div>
<h2>
Sample Questions</h2>
<ul>
<li>Create histograms of each of the numerical attributes and plot the mean and median on each graph. Describe each histogram as skewed right, left or symmetrical and justify your answers</li>
<li>Compare the graphs of total pages to ad pages</li>
<li>What proportion of magazines would be Sports & Entertainment in the average household?</li>
<li>What type of distribution would the number of ad pages be described as? Justify your answer.</li>
<li>Are there any outliers in the number of ad pages? Do the outliers change if you consider the type of magazine instead of the whole group?</li>
<li>Is the number of total pages (or ad pages) in the magazine correlated with the price of the magazine?</li>
<li>If a magazine were to have 120 pages, how many of them would you expect to have ads? Is this number different if you consider the type of magazine instead of all the magazines in the group?</li>
</ul>
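For the last question, one possible approach is a least squares line of ad pages against total pages. The sketch below is only an illustration: the four data points are made up (and deliberately fall on a perfect line so the arithmetic is easy to check), so substitute the values from the Google Sheet before drawing any conclusions.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up placeholder data: total pages and ad pages for four magazines
total_pages = [60, 100, 140, 200]
ad_pages = [20, 40, 60, 90]

slope, intercept = least_squares_line(total_pages, ad_pages)
predicted = slope * 120 + intercept   # expected ad pages for a 120-page magazine
print(slope, intercept, predicted)
```

Running the same function on just one type of magazine gives a second line to compare against, which is exactly what the question asks students to think about.<br />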
<h2>
Download the Data</h2>
<ul style="text-align: left;">
<li>You (or your students) can add to the existing data set using <a href="https://docs.google.com/forms/d/1Dr6fgnRA8Gw5srEFi9vJEW3jxDo1kIV862mgpB0-Plw/viewform" target="_blank">this form</a>. The current data can be then found on this <a href="https://docs.google.com/spreadsheets/d/1Xaf6RjAr5WFJuBjvDpEM_bCyh8jrnfg7d6DzEVDXQ5Q/edit?usp=sharing" target="_blank">Google Sheet</a>.</li>
<li>Fathom <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GelNqWWFUNFVPQ1k" target="_blank">file</a> (with <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GVDRoR3A2OUN3OGs" target="_blank">graphs</a>)</li>
<li>CODAP <a href="https://codap.concord.org/releases/latest/static/dg/en/cert/index.html?url=https://drive.google.com/open?id=1fdpuxHr8_8CEbsd0mc_yhwy6IUdNjF-Y" target="_blank">file</a> (<a href="https://codap.concord.org/releases/latest/static/dg/en/cert/index.html?url=https://github.com/davidpetro314/codap/blob/e8e96e8c5f808bf2ab84ef74013cc87fe07dfb9d/Magazines%20Example.codap" target="_blank">GitHub copy</a>)</li>
</ul>
<br />
<span face=""arial" , "tahoma" , "helvetica" , "freesans" , sans-serif" style="background-color: white; font-size: 13px; line-height: 14.56px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com2tag:blogger.com,1999:blog-2271061401959643709.post-53093886155725146642016-01-23T19:34:00.000-08:002018-08-26T16:46:15.723-07:00Trending DataI have known about all of these trending search engines and thought they were quaint, but recently I have seen some examples of uses that make me believe they may be worth more, and worth talking about in a senior Data Management class. For example, I saw this one from @NateSilver538:<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
Level of attention to Trump already so high that Palin isn't increasing searches for him much. But huge for Palin! <a href="https://t.co/BSIEwVj8P6">pic.twitter.com/BSIEwVj8P6</a></div>
— Nate Silver (@NateSilver538) <a href="https://twitter.com/NateSilver538/status/689566991691223041">January 19, 2016</a></blockquote>
<script async="" charset="utf-8" src="//platform.twitter.com/widgets.js"></script>
Another example is from the Science Friday Podcast talking about <a href="http://www.sciencefriday.com/segments/keeping-tabs-on-hate-through-google-searches/" target="_blank">tracking "hate"</a> through Google searches. Listen below:<br />
<iframe frameborder="no" height="200" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/238178556&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false&visual=true" width="100%"></iframe>
The trending site used in both of those cases was <a href="https://www.google.ca/trends/" target="_blank">Google Trends</a>, which has been around for a while. Basically, you put in the search terms you wish to compare and it shows how often they were searched on Google. For example, the Superbowl is coming up in a couple of weeks, so if you search "<a href="https://www.google.ca/trends/explore#q=Superbowl&cmpt=q&tz=Etc%2FGMT%2B5" target="_blank">Superbowl</a>", it shouldn't be surprising that we get a periodic pattern:<br />
<script src="//www.google.ca/trends/embed.js?hl=en-US&q=Superbowl&cmpt=q&tz=Etc/GMT%2B5&tz=Etc/GMT%2B5&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script>
<br />
<br />
Once you have one search term, you can add others. For example, let's see how popular Christmas is compared to the Superbowl:<br />
<script src="//www.google.ca/trends/embed.js?hl=en-US&q=Superbowl,+Christmas&cmpt=q&tz=Etc/GMT%2B5&tz=Etc/GMT%2B5&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script>
<br />
Another place to look for trending terms is Twitter. And the site <a href="http://hashtags.org/">Hashtags.org</a> gives analytics. Here you enter a hashtag and get the last 24 hours of Twitter traffic for that hashtag (at least in the free version). You can't do a comparison of hashtags but you can search any hashtag you wish. However you could highlight<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjL5QV2KHu0tlSPyPCs6O647Ay1-l3Xp5-G6amO5uMpKVeOyyYKl3tvqLjMN3uXB4dtb4oT93vS2_Goy56a6j63DmB7lTjjpore5oOgud6GP673J0uurpUDWE-Gmeyqk-1nS4b1yKe4cxk0/s1600/SnowStormTwitter.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjL5QV2KHu0tlSPyPCs6O647Ay1-l3Xp5-G6amO5uMpKVeOyyYKl3tvqLjMN3uXB4dtb4oT93vS2_Goy56a6j63DmB7lTjjpore5oOgud6GP673J0uurpUDWE-Gmeyqk-1nS4b1yKe4cxk0/s1600/SnowStormTwitter.png" /></a></div>
<br />
Another place you can get trend data is <a href="https://www.quantcast.com/top-sites" target="_blank">Quantcast.com</a>. This site does analytics on website traffic in general.<br />
<div style="text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS-wDyXbVSS0-4Jv0pM5yCFQfPa2ZxOoBuyktwjExEocT2PXabtLxzA3jETFGbX8qaf68biKBaJGAhKiwL__HzD6re3QA4nOUfOxA3_G9uOjiVxdQB5nRjg5RnXFexJOsAM0qkgKRQRv5r/s1600/Quantcast.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiS-wDyXbVSS0-4Jv0pM5yCFQfPa2ZxOoBuyktwjExEocT2PXabtLxzA3jETFGbX8qaf68biKBaJGAhKiwL__HzD6re3QA4nOUfOxA3_G9uOjiVxdQB5nRjg5RnXFexJOsAM0qkgKRQRv5r/s1600/Quantcast.png" /></a> </div>
You can get detailed analytics for free from any of the sites that are listed as directly measured.<br />
<h2>
The Analysis</h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlAP1bUVIy7N7vEAE8UkzvPy_n0YDRBqjWzzcixrnsBh_Mp8l0idWiR2AGyJw8ILQHv8y1TZIt3Fw3boYKt3SY1O_vl1G6DMdHohoF3niy_lfrtJEOMegUWJKTgsS18It6djEMVek5x6Vs/s1600/Quantcast+BarGraph.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="113" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlAP1bUVIy7N7vEAE8UkzvPy_n0YDRBqjWzzcixrnsBh_Mp8l0idWiR2AGyJw8ILQHv8y1TZIt3Fw3boYKt3SY1O_vl1G6DMdHohoF3niy_lfrtJEOMegUWJKTgsS18It6djEMVek5x6Vs/s200/Quantcast+BarGraph.png" width="200" /></a>Though there is not much analysis to be done with most of the trending sites, we often hear about topics "trending", so these sites can be used to bring something concrete to class. Some simple analysis can be done with the Quantcast site, though: just import the table of sites and you can do work on histograms and even bar graphs.<br />
<h2>
Sample Questions </h2>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDF8vENbzLfrgpQZoKjmFE9w6-MdzLWjBuwo_huItavaBF1hOvkxq7CodaNCqV9W2p0kPW49kdUQVqWm4AO8lVNJqBucDbMAyBn0ubX4GtE0zYMkp0RzHNykuXgmj-NPGlk2vK3ynBA4aB/s1600/Quantcast+Histogram.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="160" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDF8vENbzLfrgpQZoKjmFE9w6-MdzLWjBuwo_huItavaBF1hOvkxq7CodaNCqV9W2p0kPW49kdUQVqWm4AO8lVNJqBucDbMAyBn0ubX4GtE0zYMkp0RzHNykuXgmj-NPGlk2vK3ynBA4aB/s200/Quantcast+Histogram.png" width="200" /></a></div>
<ul>
<li>Find a trending topic on Twitter or Google. Verify the data using one of the trending analytic sites. Compare to a similar topic.</li>
<li>How does the traffic of the top 10 most popular sites compare to the next 10?</li>
<li>Are there any outliers in the set of most popular sites?</li>
</ul>
<h2>
Download the Data</h2>
Website: <a href="https://www.google.ca/trends/" target="_blank">https://www.google.ca/trends/</a><br />
Website: <a href="https://www.hashtags.org/" target="_blank">https://www.hashtags.org/</a><br />
Website: <a href="https://www.quantcast.com/top-sites" target="_blank">https://www.quantcast.com/top-sites</a><br />
Quantcast data (<a href="https://drive.google.com/open?id=1ICCtkbCnwHJyXxPex1gVRkXu2liMsBHK7Y4o4vRm5M8" target="_blank">Sheets</a>, <a href="https://drive.google.com/open?id=1BkInwh9pfohl9UWlmjs29Ff8ESgiZgRPMfEMePRpfq0" target="_blank">Sheets with graphs</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7Ga0xPMUtFX2tuem8" target="_blank">Fathom</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GcHh2VkhIMTZiWFU" target="_blank">Fathom with Graphs</a>, <a href="https://drive.google.com/open?id=1N2lu-03zJup17WI1Bt-g6X2iLLDDWFSp" target="_blank">CODAP</a>)<br />
<br />
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 14.56px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-20092439653963465202016-01-15T21:18:00.000-08:002016-01-22T12:57:57.347-08:00Where are the Rey Star Wars Toys?<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj8npuIPFZTHAglW1_Y1mvmBFahNs6J2yNPYs9ol1CZ5iVjC6gZWC8xg5DGnPN7JcugrVg1n_ZHzUAS6F1WERl_kOXoHJCLY2Yshtu2fJ9rMAJ_JRUUGKAyYb-yqzYctOOZ_jcecvON4cti/s1600/StarWarsToysData.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj8npuIPFZTHAglW1_Y1mvmBFahNs6J2yNPYs9ol1CZ5iVjC6gZWC8xg5DGnPN7JcugrVg1n_ZHzUAS6F1WERl_kOXoHJCLY2Yshtu2fJ9rMAJ_JRUUGKAyYb-yqzYctOOZ_jcecvON4cti/s400/StarWarsToysData.png" width="400" /></a>This comes from a post from <a href="http://fivethirtyeight.com/features/wheresrey-the-star-wars-heroine-is-featured-in-fewer-toys-than-all-the-new-dudes/" target="_blank">Five Thirty Eight</a> looking at the distribution of new toys from the new Star Wars film. This is just a simple data set that could be made into a bar graph where students might be interested in the data. And it seems like maybe the scarcity of Rey toys was not <a href="http://www.hypable.com/star-wars-toymakers-specifically-directed-to-exclude-rey/" target="_blank">accidental</a>.<br />
<h2>
The Analysis</h2>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTmSRiD212B3MEowxBWp_d-glueBGoxSv4lZGFKEbAZu2cBm9YVBUPsm1dR2CpiDjtRfMZC1Va164mEq3gR4vFut6QbRFP7n1DUNzjMnP5AJEV9X2I6MNFBR0UQAc23narvIFx_JqhiG9H/s1600/StarWarsToysCircle.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="232" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTmSRiD212B3MEowxBWp_d-glueBGoxSv4lZGFKEbAZu2cBm9YVBUPsm1dR2CpiDjtRfMZC1Va164mEq3gR4vFut6QbRFP7n1DUNzjMnP5AJEV9X2I6MNFBR0UQAc23narvIFx_JqhiG9H/s320/StarWarsToysCircle.png" width="320" /></a></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFx6Ar-uDrFSrlv_eqxErLcn4fv0Zp5LjszsOiSbiYBzy6qcD11oedSK2Njveh000jPG_ijqhcVfyg8vxR1UKxs8yK4vDJ0nXl3Y7a5SXV3MZhp1nyTF4y3hjCHc7w5OccKIX2jtcJJi8W/s1600/StarWarsToys.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFx6Ar-uDrFSrlv_eqxErLcn4fv0Zp5LjszsOiSbiYBzy6qcD11oedSK2Njveh000jPG_ijqhcVfyg8vxR1UKxs8yK4vDJ0nXl3Y7a5SXV3MZhp1nyTF4y3hjCHc7w5OccKIX2jtcJJi8W/s320/StarWarsToys.png" width="320" /></a>There is not much analysis for students to do here. They can create the bar graph and then answer some questions about it. The point here is that the data set itself is what is interesting for students. Students could also make a pie graph from the data since it represents 100% of the data. One of the good things this data set can do is help show why pie graphs aren't that good for analysis when the values are so close to each other (if you are just looking at the pie slices, it is hard to tell which is bigger without the percents showing). Most statisticians agree that, for the most part, pie graphs are <a href="http://www.businessinsider.com/pie-charts-are-the-worst-2013-6" target="_blank">not very informative</a>. Yet we see them all the time. For example, look at the two representations to the right. The bar graph and pie graph show the same information, but the pie graph is only useful for specific analysis if the percentages are actually shown. Otherwise it would be hard to determine the relative sizes of the slices and thus the relative weights of each type of toy. The problem becomes even worse when you use a 3D pie graph (so often used on news shows): without the percents you cannot tell the difference in size between many of the slices. Of course the pie graph looks nicer, though.<br />
<h2>
Sample Questions</h2>
<ul>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaVL6Dn9sedRADZkhbEC8Iw3jtWz2KAYHgR9YyhMUBfr6TEkXd-hS4oigSQi1TLxJKzl9DQ5NSiTxlwMyxI4o86UcYcyZWTXXJ019lrXV-sqlPnAptS08iCwW6_BiOFbc0olyriSs9VcHS/s1600/StarWarsToysCircle2.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaVL6Dn9sedRADZkhbEC8Iw3jtWz2KAYHgR9YyhMUBfr6TEkXd-hS4oigSQi1TLxJKzl9DQ5NSiTxlwMyxI4o86UcYcyZWTXXJ019lrXV-sqlPnAptS08iCwW6_BiOFbc0olyriSs9VcHS/s320/StarWarsToysCircle2.png" width="320" /></a>
<li>By what percentage does the number of Kylo Ren toys surpass the number of BB-8 toys?</li>
<li>Which type of graph would be better for this data, bar or circle? Justify your choice.</li>
</ul>
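The first question is a percent difference calculation, which students often set up with the wrong base. Here is a minimal Python sketch of the arithmetic. The two toy counts are placeholders I made up for illustration, not the values from the Five Thirty Eight data, so swap in the real counts from the Google Sheet.

```python
def percent_more(a, b):
    """Return by what percentage a exceeds b, using b as the base."""
    if b == 0:
        raise ValueError("the base value b must not be zero")
    return (a - b) / b * 100


# Placeholder counts only; replace with the counts from the data set
kylo_ren_toys = 50
bb8_toys = 40

print(percent_more(kylo_ren_toys, bb8_toys))
```

Note that the base is the BB-8 count, because the question asks how far Kylo Ren surpasses BB-8. Swapping the base gives a different answer, which is worth discussing with students.<br />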
<h2>
Download the Data</h2>
Google <a href="https://drive.google.com/open?id=1_A-Pf6HSP8AqqGlJIO01_u6Xz734qfHWyAIN-Smoy-s" target="_blank">Sheets</a> (with <a href="https://drive.google.com/open?id=12dWfENYEpbqi-Up8ERyckUaCEc0_ISquziMV1s_JV0A" target="_blank">graphs</a>)<br />
The original post<br />
<a href="http://fivethirtyeight.com/features/wheresrey-the-star-wars-heroine-is-featured-in-fewer-toys-than-all-the-new-dudes/" target="_blank">http://fivethirtyeight.com/features/wheresrey-the-star-wars-heroine-is-featured-in-fewer-toys-than-all-the-new-dudes/</a>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-46094376330927256972016-01-06T08:31:00.002-08:002021-05-10T20:03:00.899-07:00Earthquake DatabaseLast week friends of mine felt a 4.8 magnitude earthquake on Vancouver Island. So it seems like a perfect time to post some resources on data about earthquakes. As it turns out, depending on the magnitude, there are a lot of earthquakes that happen worldwide each year. And we can get that data, almost in real time, from any number of earthquake databases. I like the one that the <a href="http://earthquake.usgs.gov/earthquakes/search/" target="_blank">US Geological Survey</a> provides. This lets you set a few options and search earthquakes based on those options. The default result is a map that shows the results of your search.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBo6UZf0KA9KuRlDqv08plHlZ29AeghKHVwvr2rqsDyDgOI_ET_JNRdWSgJm7CAmyc0cFfqF7ere6VGWjny1w4rO7Omiyg9bHtEljHBIYU4w4-a44JJ9kRx81sp6hgqmL1tlX3as3geIF1/s1600/Earthquake2.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBo6UZf0KA9KuRlDqv08plHlZ29AeghKHVwvr2rqsDyDgOI_ET_JNRdWSgJm7CAmyc0cFfqF7ere6VGWjny1w4rO7Omiyg9bHtEljHBIYU4w4-a44JJ9kRx81sp6hgqmL1tlX3as3geIF1/s320/Earthquake2.png" width="550" /></a></div>
<h2 style="clear: both; text-align: left;">
The Analysis</h2>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5sNmVPLz2PT3uFiZngrj9ItYB1D7KwnAHD5_3lVPcN1RCvdPo6dvnOB32ynA9ZV3pTjlShHrdTxRsJKEFRSoHuqzuTYt-YTl39YBxyRAsQkHeKpgxh7slXNeLg4O60kazbEq5ItpMpDoq/s1600/Earthquake1.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5sNmVPLz2PT3uFiZngrj9ItYB1D7KwnAHD5_3lVPcN1RCvdPo6dvnOB32ynA9ZV3pTjlShHrdTxRsJKEFRSoHuqzuTYt-YTl39YBxyRAsQkHeKpgxh7slXNeLg4O60kazbEq5ItpMpDoq/s200/Earthquake1.png" width="200" /></a>Once you have chosen which options to use, you have to get the data. I suggest that you initially limit your searches to earthquakes over magnitude 6 if you are looking at an extended time period (in 2015 there were over 140). If you play around with the magnitude (say, dropping the threshold to 4.5) then you could get a huge amount of data (which you may or may not want). For example, if you drop that threshold to 4.5, there are over 6800 earthquakes found for 2015.<br />
<br />
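If you would rather script the search than click through the form, here is a minimal Python sketch that only builds the query address for the two thresholds mentioned above. The base address and the parameter names (format, starttime, endtime, minmagnitude) are my assumption about the USGS event web service and are not taken from the search page itself, so check them against the USGS documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base address of the USGS event web service (verify on the USGS site).
BASE_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_query_url(start, end, min_magnitude):
    """Build a URL asking for a CSV of quakes at or above min_magnitude."""
    params = {
        "format": "csv",            # ask for a CSV file
        "starttime": start,         # first day to include (YYYY-MM-DD)
        "endtime": end,             # end of the search window
        "minmagnitude": min_magnitude,
    }
    return BASE_URL + "?" + urlencode(params)


# The two thresholds discussed in the post, both for the year 2015.
url_big = build_query_url("2015-01-01", "2016-01-01", 6)
url_all = build_query_url("2015-01-01", "2016-01-01", 4.5)

print(url_big)
print(url_all)
```

Pasting either address into a browser should behave like the Download button, if the assumptions above hold.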
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYKCrFUtVdZdd5I06EMcB7heC83lzGLxuAxd8jxShTkOhevqB5MQ5CgLXAOaSErZd0m1wfEoD6yrKnQE9syPHIlGYPgwvyS6RQMbgQDh8u6GWakiIwaSx6vH5NZE9WZAtz_K9uv1FMgA4i/s1600/Earthquake3.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="83" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYKCrFUtVdZdd5I06EMcB7heC83lzGLxuAxd8jxShTkOhevqB5MQ5CgLXAOaSErZd0m1wfEoD6yrKnQE9syPHIlGYPgwvyS6RQMbgQDh8u6GWakiIwaSx6vH5NZE9WZAtz_K9uv1FMgA4i/s200/Earthquake3.png" width="200" /></a>Once you get the data, you can just click the Download button on the top left to choose a CSV file that can be imported into any spreadsheet or Fathom. The obvious analysis here is a single variable set of the Magnitude (they call it mag in the data set). So you could do any number of histograms, box plots, dot plots etc as well as measures of central tendency and standard deviation. It's a really good data set for having students go through all the basic calculations needed when doing a single variable analysis.<br />
<br />
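For anyone who wants to do the single variable calculations in code instead of a spreadsheet, here is a small Python sketch. The rows are made-up illustrative values, not real USGS data, and every column name other than mag is my assumption about the CSV layout.

```python
import csv
import io
import statistics

# Made-up rows standing in for a downloaded CSV file (NOT real USGS values).
SAMPLE_CSV = """time,latitude,longitude,mag
2015-01-01,10.0,120.0,6.1
2015-02-01,-5.0,150.0,6.0
2015-03-01,35.0,140.0,6.4
2015-04-01,28.0,84.0,7.8
2015-05-01,-20.0,-70.0,6.0
2015-06-01,52.0,-170.0,6.9
"""


def read_magnitudes(text):
    """Return the mag column as a list of floats, skipping blank entries."""
    reader = csv.DictReader(io.StringIO(text))
    return [float(row["mag"]) for row in reader if row["mag"]]


mags = read_magnitudes(SAMPLE_CSV)

summary = {
    "mean": statistics.mean(mags),
    "median": statistics.median(mags),
    "mode": statistics.mode(mags),
    "stdev": statistics.stdev(mags),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With a real download you would read the file from disk instead of the sample string.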
Depending on the time period your data covers, you may get outliers.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrh9s0JUmXo2yJrdGisz9tJ8RBHKWQijy45srWXylrDhvy-EJ-eB6vjEO5tyj3KBc_D85evGQRNa2qT0KP7CmXRUR1PYtUhDBbC6ttz534RSpUunMPwgbQ6MSJYuQXzC-ilx-ZPIVNjg2j/s1600/Earthquake4.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrh9s0JUmXo2yJrdGisz9tJ8RBHKWQijy45srWXylrDhvy-EJ-eB6vjEO5tyj3KBc_D85evGQRNa2qT0KP7CmXRUR1PYtUhDBbC6ttz534RSpUunMPwgbQ6MSJYuQXzC-ilx-ZPIVNjg2j/s1600/Earthquake4.png" /></a></div>
<br />
Usually the data will come out skewed to the right, as most of the quakes are typically at the low end (this is regardless of what you choose as your threshold).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvDr2EuICKE9YnNLRRAm8rGAQmt3qU-v30K1r7gwXQpg9aNf6uaDUBuLLE8veGic2mV4b6elUYWpPR8by_wzDN6m_c8Z0b32ZO3D7fC1pLRGfNpBQSQJJMMF_6pzqTebQA0ngciH64u72C/s1600/EarthQuake5.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvDr2EuICKE9YnNLRRAm8rGAQmt3qU-v30K1r7gwXQpg9aNf6uaDUBuLLE8veGic2mV4b6elUYWpPR8by_wzDN6m_c8Z0b32ZO3D7fC1pLRGfNpBQSQJJMMF_6pzqTebQA0ngciH64u72C/s1600/EarthQuake5.png" /></a></div>
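Here is a rough Python sketch of how you might see that right skew without a graphing tool: count the quakes in half-magnitude bins and compare the mean to the median. The magnitudes are made-up illustrative values, not real data, and the bin width of 0.5 is just my choice.

```python
import math
import statistics
from collections import Counter

# Made-up magnitudes for illustration only (NOT real USGS values).
mags = [6.0, 6.1, 6.0, 6.2, 6.3, 6.6, 6.4, 7.1, 6.0, 7.8]


def bin_counts(values, width):
    """Count how many values fall in each bin, keyed by the bin's lower edge."""
    return Counter(math.floor(v / width) * width for v in values)


counts = bin_counts(mags, 0.5)

# A crude text histogram: one star per earthquake in the bin.
for lower_edge in sorted(counts):
    print(f"{lower_edge:.1f} | {'*' * counts[lower_edge]}")

mean = statistics.mean(mags)
median = statistics.median(mags)

# A mean noticeably above the median is one common sign of a right skew.
print("mean above median:", mean > median)
```

Most of the stars pile up in the lowest bin, with a thin tail stretching to the right.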
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibpjMUeQm5iXulj76GwMgf-3AfZPT0L6d_C3e6EYvdCLWuR_g9KzvoKGQY7fAhrDdPGQ1TUvv0_N6xJwellaOYF_mYfU5cdPMSdHE_EXPbeQoAxXg1Z2QZSdSP7lxVY2qMWabzdYk1qM_C/s1060/CODAP-Map.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="702" data-original-width="1060" height="253" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibpjMUeQm5iXulj76GwMgf-3AfZPT0L6d_C3e6EYvdCLWuR_g9KzvoKGQY7fAhrDdPGQ1TUvv0_N6xJwellaOYF_mYfU5cdPMSdHE_EXPbeQoAxXg1Z2QZSdSP7lxVY2qMWabzdYk1qM_C/w383-h253/CODAP-Map.png" width="383" /></a></div>You can also do a neat "heat map" by choosing Map in CODAP and dragging something like the Magnitude onto the middle of the graph so it appears as a colour spectrum. This can be done in Fathom by plotting the Longitude and Latitude (and thus getting a map) onto the regular graph.<br />
<br /><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOVTG5C5A02L0YLEQVH-_hpC_5vbCePi7-xNI49NoOp8syQpCMIxYSrODsDyIZ-YZigwGAjjNo3yS1d9uAeOYj4gqPUK3JTf6BvlI0cKuGprW6L27BC6XVud7m0sI7zRGkRZxi7BO4PCEU/s1060/CODAP-Map.png" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a></div></div><div class="separator" style="clear: both; text-align: left;">Here's a quick video on getting this data from the database into CODAP to use the Mapping feature:</div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="BLOG_video_class" height="322" src="https://www.youtube.com/embed/dB2or9FkPRk" width="553" youtube-src-id="dB2or9FkPRk"></iframe></div><br /><div class="separator" style="clear: both; text-align: left;"><br /></div>
<h2>
Sample Questions</h2>
<ul>
<li>Determine the measures of central tendency for the magnitude of the earthquakes</li>
<li>Determine the five number summary for the magnitude of the earthquakes</li>
<li>Which earthquake(s) were the most extreme? Were they outliers?</li>
<li>How are the measures of central tendency affected if you remove the outlier(s) when looking at the magnitude of the earthquakes?</li>
<li>Determine whether the data for the magnitude of the earthquakes is skewed to the right or left.</li>
</ul>
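Some of the questions above can be checked with a few lines of Python. This is only a sketch: the magnitudes are made-up illustrative values, and I am assuming the common 1.5 × IQR rule for outliers and the "inclusive" quartile method, which may differ slightly from the quartile definition in your textbook or software.

```python
import statistics

# Made-up magnitudes for illustration only (NOT real USGS values).
mags = [6.0, 6.1, 6.0, 6.2, 6.3, 6.6, 6.4, 7.1, 6.0, 7.8]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the inclusive quartile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


def iqr_outliers(values):
    """Return values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


summary = five_number_summary(mags)
outliers = iqr_outliers(mags)

# Recompute the mean without the outliers to see how much it moves.
trimmed = [v for v in mags if v not in outliers]

print("five number summary:", summary)
print("outliers:", outliers)
print("mean with outliers:", statistics.mean(mags))
print("mean without outliers:", statistics.mean(trimmed))
```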
<h2>
Other Earthquake Data</h2>
<div>
If students want to do something more with their earthquake data (like analyze it and then make sense of it) they might try getting more info at <a href="http://www.iris.edu/hq/" target="_blank">IRIS</a> (Incorporated Research Institutions for Seismology). They have some of the same data and more, plus other info that might be relevant. Thanks to <a href="https://twitter.com/frankmcgowa" target="_blank">@frankmcgowa</a> for that one.</div>
<h2>
Download the data</h2>
<ul>
<li>Website: <a href="http://earthquake.usgs.gov/earthquakes/search/" target="_blank">http://earthquake.usgs.gov/earthquakes/search/</a></li>
<li>Sample data for 2015 (<a href="https://drive.google.com/open?id=10sgD1HUvU_opAkbjvZOW3DgSee2PpT0CjTfg8hs78is" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GS2xHbEFjRWZBZDg" target="_blank">Fathom</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7Gb0dsUmJtZ3FOeUU" target="_blank">Fathom with graphs</a>, <a href="https://drive.google.com/open?id=1xWGNkzCGp_hRAK84d-C_3A4UM8EcTchX" target="_blank">CODAP</a>)</li>
</ul>
<div class="post-body entry-content" id="post-body-4797273810764863154" itemprop="description articleBody" style="-webkit-text-stroke-width: 0px; background-color: white; color: black; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 14.56px; orphans: auto; position: relative; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 1; width: 586.4px; word-spacing: 0px;">
Let me know if you used this data set or if you have suggestions of what to do with it beyond this.<br />
<div style="clear: both;">
</div>
</div>
David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-82808842587715112952015-12-21T20:39:00.000-08:002018-08-26T16:37:33.642-07:00How much would you pay for a $50 Gift Card?How much would you pay for a gift card on eBay? Perhaps I should back up a bit. Maybe for Christmas someone gets me a Tiffany's gift card. I will likely not be going to Tiffany's any time soon (don't tell my wife). So that gift card is not worth much to me. But it may be worth something to someone else. So being an enterprising person, I put it up for auction on eBay. I wouldn't expect to sell it for more than what the gift card is worth (you would think). So the question then is, what percent of the actual value of the card will I be able to sell it for? Well, years ago the crew at <a href="http://freakonomics.com/" target="_blank">Freakonomics</a> shared this data set of <a href="http://graphics8.nytimes.com/images/blogs/freakonomics/pdf/eBayGiftCards.xls" target="_blank">100 gift cards</a> and what they sold for on eBay. The data is almost 10 years old but it still turns out to be a fairly rich data set.<br />
<h2>
The Analysis</h2>
So the attributes in this set are the card type (Best Buy, iTunes etc.), the value of the card, how much it sold for, the shipping costs, the number of bids, the feedback rating of the seller, the percentage of the sale (including the shipping), the average percentage per card and the actual link of the auction. That means there are a large number of things you can analyse. For single variable stuff you could find measures of central tendency for the entire set or individually for each type of card. Or just choose your type of single variable graph and create it for the whole group or by card type. <br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoGNaAAjPe5HLPDNBvLE9vIpImJbFkhV9PUXNjQ5biqkbOkye1Lail4yClviSQpsaOS3KGVTCuF6UkZ3Ma5t_lETI6oDJ7NqqirnEVJn7NV8VdjL5SYOATF_bbiV8uj1Lv833-xpsgxxol/s1600/GiftCards-boxplot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoGNaAAjPe5HLPDNBvLE9vIpImJbFkhV9PUXNjQ5biqkbOkye1Lail4yClviSQpsaOS3KGVTCuF6UkZ3Ma5t_lETI6oDJ7NqqirnEVJn7NV8VdjL5SYOATF_bbiV8uj1Lv833-xpsgxxol/s1600/GiftCards-boxplot.png" /></a></div>
Or you could do some two variable analysis to see the connection between the value of the card and the sale price (for either the whole group or by card type).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyt0X1WYoolyXYMpjc4gOEW1fmx1sBTO_peHfzfjy1LERS2QgQyx7cCTMaWeECwoMYdvaq_Sil9s1itS_SEs1gv87Wunf66GoB6TFEyAM_Ts8XrmX9ft0xhYgk0_b0p3ytw_ei2QaTBbBO/s1600/GiftCards-Scatter.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyt0X1WYoolyXYMpjc4gOEW1fmx1sBTO_peHfzfjy1LERS2QgQyx7cCTMaWeECwoMYdvaq_Sil9s1itS_SEs1gv87Wunf66GoB6TFEyAM_Ts8XrmX9ft0xhYgk0_b0p3ytw_ei2QaTBbBO/s1600/GiftCards-Scatter.png" /></a></div>
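If you want to see where a line of best fit like this comes from, here is a short Python sketch of the least squares calculation for sale price against card value. The pairs are made-up illustrative numbers, not rows from the Freakonomics spreadsheet, and the variable names are my own.

```python
# Made-up (card value, sale price) pairs for illustration only.
values = [25, 50, 50, 100, 100]
sold = [20, 42, 44, 85, 89]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(values, sold)

print(f"sale price is about {slope:.3f} * value + ({intercept:.3f})")
```

A slope below 1 would mean cards tend to sell for less than face value, which is a nice thing to have students interpret. Running the same function on each card type separately lets them compare the lines.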
And because the data exists, you could even do some comparisons of the average percentage that a card gets.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHuQJAt2eslS2ws7B8aI-IC0L7yFfKU1dxYwczYysMTRSe52PFPzKvw-UxEmZs_cjrXLwP6BDDj6ERKr7zQkr2mX15bE4KgJIogw-08pMGlFFZE6nlluIGNGKp9oj85nWHuSKtnYBU89oC/s1600/GiftCards-Percent.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHuQJAt2eslS2ws7B8aI-IC0L7yFfKU1dxYwczYysMTRSe52PFPzKvw-UxEmZs_cjrXLwP6BDDj6ERKr7zQkr2mX15bE4KgJIogw-08pMGlFFZE6nlluIGNGKp9oj85nWHuSKtnYBU89oC/s1600/GiftCards-Percent.png" /></a></div>
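The percentage comparison can be sketched in Python as well. I am assuming the percentage is the sale price plus shipping divided by the card value, based on the "including the shipping" description above. The rows are made-up illustrative numbers, not the actual auction data.

```python
from collections import defaultdict

# Made-up rows: (card type, card value, sale price, shipping cost).
rows = [
    ("Best Buy", 50, 44, 1),
    ("Best Buy", 100, 92, 0),
    ("iTunes", 25, 19, 1),
    ("iTunes", 50, 41, 1),
]


def percent_of_value(value, sale_price, shipping):
    """Total paid (sale price plus shipping) as a percent of the card's value."""
    return 100 * (sale_price + shipping) / value


def average_percent_by_type(data):
    """Average the percent of value separately for each card type."""
    groups = defaultdict(list)
    for card_type, value, sale_price, shipping in data:
        groups[card_type].append(percent_of_value(value, sale_price, shipping))
    return {card: sum(p) / len(p) for card, p in groups.items()}


averages = average_percent_by_type(rows)

for card, avg in averages.items():
    print(f"{card}: {avg:.1f}% of face value on average")
```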
<h2 style="clear: both; text-align: left;">
Sample Questions</h2>
<ul>
<li>Identify the outliers for each card type (Value, sold etc) and suggest why they might be outliers</li>
<li>Identify the spread for the Value of each card type. Why might some cards have smaller spreads than others?</li>
<li>How does the linear regression compare for different types of cards?</li>
<li>Are there any cards that were sold for more than they were worth? What might cause someone to pay more for a card than what it is worth?</li>
<li>Why might some cards have a higher average sale rate?</li>
</ul>
<h2>
Other Stories</h2>
This data came out of a <a href="http://www.nytimes.com/2007/01/07/magazine/07wwln_freak.t.html?ex=1325826000&en=970d53de24147ae4&ei=5090&partner=rss" target="_blank">story originally</a> about why companies love gift cards (and the <a href="http://freakonomics.com/2007/01/07/freakonomics-in-the-times-magazine-gift-card-economy/" target="_blank">page of supporting data</a> for the article). As it turns out, gift cards tend to be like free money for companies. This is because people so often don't use up all of their gift cards and then forget about them. I think part of that is because we are required to know exactly how much is left on a gift card in order to use it. The supporting data even shows (<a href="http://graphics8.nytimes.com/images/blogs/freakonomics/pdf/Best%20Buy%20Fiscal%202006%20Annual%20Report.pdf" target="_blank">on pg 65</a>) how much extra money Best Buy made because of unused gift cards (spoiler alert, it was $43 million).<br />
<h2>
Download the Data</h2>
<ul>
<li>The original Spreadsheet (<a href="http://graphics8.nytimes.com/images/blogs/freakonomics/pdf/eBayGiftCards.xls" target="_blank">Excel</a>, <a href="https://drive.google.com/open?id=1GzxYyM33ShFiAJryM-Wxzh0ASi1aRL-zd-3E0CcnXGU" target="_blank">Google Sheets</a>)</li>
<li>Fathom (<a href="https://drive.google.com/open?id=0B0-WU3yuHa7GMGkwVm1KUzcxYkk" target="_blank">Data</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GMEw0YXllOVZiYTg" target="_blank">With Graphs</a>)</li>
<li>CODAP <a href="https://drive.google.com/open?id=1updrpTvyL9G9ws_090K5kpabQO6er1QJ" target="_blank">file</a></li>
</ul>
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 12.1333px;">Let me know if you used this data set or if you have suggestions of what to do with it beyond this.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0tag:blogger.com,1999:blog-2271061401959643709.post-62965394129003619492015-12-17T08:56:00.000-08:002019-04-02T08:34:30.762-07:00Movie Data<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi39HjK0Ogt96GvYQ7UI_-54gZh3ge7BIgivXquUWMzHFERO4nmpc6Dl9Yu_zxU2gzHjzJqfGTYVOCIRsKlJys_dVmkgWAukSUJxT8aNjT2KYbOr6JjtjNZJCOrHGmxIjOcI91IxCwuCGDN/s1600/Star+Wars.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="173" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi39HjK0Ogt96GvYQ7UI_-54gZh3ge7BIgivXquUWMzHFERO4nmpc6Dl9Yu_zxU2gzHjzJqfGTYVOCIRsKlJys_dVmkgWAukSUJxT8aNjT2KYbOr6JjtjNZJCOrHGmxIjOcI91IxCwuCGDN/s320/Star+Wars.png" width="320" /></a>Given that, as I type this, the new <a href="http://www.the-numbers.com/movies/franchise/Star-Wars#tab=summary" target="_blank">Star Wars</a> movie is coming out this week, it seems like a perfect time to highlight some places to go get data about movies. There are a pile of places to go. And kids (and most humans) love movies, so why not find some data that kids will be more engaged in exploring? As it turns out there are a few really great places to get real time data on movies. I'm going to focus on two.<br />
<h2>
Box Office Mojo</h2>
<a href="http://www.boxofficemojo.com/img/misc/bom_logo1.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://www.boxofficemojo.com/img/misc/bom_logo1.png" /></a>The first one is <a href="http://www.boxofficemojo.com/" target="_blank">http://www.boxofficemojo.com/</a>. There is a lot of data that you can choose from and it is almost realtime. For example you can click on <a href="http://www.boxofficemojo.com/daily/" target="_blank">Daily</a> and it will give the summary of total domestic (US) ticket sales for each day. Or at the top if you click the <a href="http://www.boxofficemojo.com/daily/chart/" target="_blank">daily summary</a> you will get the top movies of the day and how much they made (among other things, right down to the dollar). You can even drill down and click on the <a href="http://www.boxofficemojo.com/movies/?page=daily&view=chart&id=heartofthesea.htm" target="_blank">movie name</a> to get things like how many theatres it is in. One of the other neat things is they have "Showdowns" of movies and do comparisons like this one from <a href="http://www.boxofficemojo.com/showdowns/chart/?id=martiangravity.htm" target="_blank">Interstellar, Gravity and The Martian</a>. But by far the coolest thing is the <a href="http://www.boxofficemojo.com/alltime/?page=bychart&p=.htm" target="_blank">all time chart</a> which gives the records for a huge number of metrics.<span id="goog_758033473"></span><a href="https://www.blogger.com/"></a><br />
<span id="goog_758033472"></span><br />
<h2>
The Numbers</h2>
<a href="http://www.the-numbers.com/images/the-numbers-banner.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://www.the-numbers.com/images/the-numbers-banner.png" height="40" width="320" /></a>The second site I like is <a href="http://www.the-numbers.com/" target="_blank">http://www.the-numbers.com/</a>. Here you can get some of the same stats like the <a href="http://www.the-numbers.com/box-office" target="_blank">box office</a> info from any day of any year, but also stuff on <a href="http://www.the-numbers.com/weekly-dvd-sales-chart" target="_blank">DVD sales</a> as well as how <a href="http://www.the-numbers.com/bankability" target="_blank">bankable</a> a star is. And it even has a special <a href="http://www.the-numbers.com/movies/report-builder" target="_blank">Report Builder</a> page where you can generate your own report with the info you want. But for me, by far, the best part is their movie <a href="http://www.the-numbers.com/movie/budgets/" target="_blank">budgets page</a> where you can get the all time list of movies by <a href="http://www.the-numbers.com/movie/budgets/all" target="_blank">production budget</a> (over 5000 of them) or the top 20 movies that were most profitable.<br />
<h2>
The Analysis</h2>
There is so much that you can do with this data that you could probably pick any topic and find something to report on. But let me highlight a few of my favourite things to do. For example, take the daily movie data from Box Office Mojo (<a href="https://drive.google.com/open?id=0B0-WU3yuHa7GeV9uZnJkWlYxQ2s" target="_blank">Fathom</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GWGU4RFdyMk02WVk" target="_blank">Fathom Sol</a>, <a href="https://docs.google.com/spreadsheets/d/1q0LVSLzgHCWeR4sq6IRdbV48p7_4mZz2KAmY0Xfkc_Q/edit?usp=sharing" target="_blank">Google Sheet</a>). At the low end you could have students create histograms, dot plots and box plots, and compare measures of central tendency. At the higher end you can have them look for outliers or compare what happens day to day.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCzRNBKy2Qe97O9-uLLUirpHpArvzzR6Sv58lRhpeo2tQOA9WFUaXd20yljPOX2DwBbzw2GxZJqF3lS54NBnkdMoYvmSFh104gJFFl3rvKNn6h7nZYMQp0nvOwjJkq7F24idqHE06xrRKC/s1600/FathomDaily+Movie.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCzRNBKy2Qe97O9-uLLUirpHpArvzzR6Sv58lRhpeo2tQOA9WFUaXd20yljPOX2DwBbzw2GxZJqF3lS54NBnkdMoYvmSFh104gJFFl3rvKNn6h7nZYMQp0nvOwjJkq7F24idqHE06xrRKC/s1600/FathomDaily+Movie.png" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
That daily data was a summary. You can also take the daily data from The Numbers (<a href="https://drive.google.com/open?id=0B0-WU3yuHa7GdDR1RW5QdVlzYUE" target="_blank">Fathom</a>, <a href="https://drive.google.com/open?id=0B0-WU3yuHa7GWXB1cko0VUFFS3M" target="_blank">Fathom Sol</a>, <a href="https://docs.google.com/spreadsheets/d/1zC0Qr22a3MFESVun_Kntkl4bDMJ66crGkEPJdLE4axY/edit#gid=0" target="_blank">Google Sheet</a>). My favourite thing to do, after the single variable analysis of the amount of money, is a two variable analysis of how the money compares to the number of theatres each movie was in. Then see if any of the movies might get lost in that data (like The Big Short, which hardly played in any theatres but had the most tickets sold per theatre), or notice that In the Heart of the Sea is doing better than expected and The Peanuts Movie is doing worse than expected.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq242UlcSE79xAJn2wrQBhbdFmk8bVobtXfRx41C-WoCw9-vZ7WH46s13hX8tsFnEj8c5YGY2zEDzpEo4upeBuc5gMslQmn7sv_GYQfu8j0kjy8q-Ufyqbb2gBSmTftAwqWDQ3EOMax7Xy/s1600/DailyMovieTheatres.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq242UlcSE79xAJn2wrQBhbdFmk8bVobtXfRx41C-WoCw9-vZ7WH46s13hX8tsFnEj8c5YGY2zEDzpEo4upeBuc5gMslQmn7sv_GYQfu8j0kjy8q-Ufyqbb2gBSmTftAwqWDQ3EOMax7Xy/s1600/DailyMovieTheatres.png" /></a></div>
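One way to keep a small release from getting lost behind the big ones is to compute a per theatre average, sketched here in Python. The titles and figures are made up for illustration and are not real box office numbers.

```python
# Made-up daily figures for hypothetical titles (NOT real box office data).
daily = [
    {"title": "Movie A", "gross": 4_000_000, "theatres": 4000},
    {"title": "Movie B", "gross": 1_500_000, "theatres": 3000},
    {"title": "Movie C", "gross": 90_000, "theatres": 8},
]


def per_theatre_average(row):
    """Daily gross divided by the number of theatres showing the movie."""
    return row["gross"] / row["theatres"]


# Rank movies by per theatre average, highest first.
ranked = sorted(daily, key=per_theatre_average, reverse=True)

for row in ranked:
    print(f'{row["title"]}: ${per_theatre_average(row):,.0f} per theatre')
```

In this made-up example the movie with the smallest total gross comes out on top once you divide by theatres, which is the kind of story a scatter plot alone can hide.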
<div class="separator" style="clear: both; text-align: left;">
Another of my favourite things is to look at how movies did compared to what they cost to make. There is a lot of info on this on The Numbers and one of my favourite examples is the <a href="http://www.the-numbers.com/movie/Blair-Witch-Project-The#tab=box-office" target="_blank">Blair Witch Project</a>, a movie that only cost $60,000 to make yet had a worldwide total gross of almost $250 million. You can get the daily numbers for any movie like this, and in this case see that it started out in one theatre and did well, then expanded to about 30 theatres and did well, and then finally got a much wider distribution and blew up. </div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjE7e3y5GVen9Myr4qTYqKW1H7cIgTvmk05YBjOVGC9TnLE4lGTvbz5YmS_VyL25k4nwT33ess9-WHiVAu7dbrY6HqDLWlFZ4jO0pbfkRFNXUNbG1sczX08pYJez7bMVTdphOR83nLRN6dG/s1600/BlairWitch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjE7e3y5GVen9Myr4qTYqKW1H7cIgTvmk05YBjOVGC9TnLE4lGTvbz5YmS_VyL25k4nwT33ess9-WHiVAu7dbrY6HqDLWlFZ4jO0pbfkRFNXUNbG1sczX08pYJez7bMVTdphOR83nLRN6dG/s1600/BlairWitch.png" /></a></div>
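To put the Blair Witch numbers in perspective, here is a quick back of the envelope calculation in Python. It uses the two figures quoted above, and since the gross was "almost" $250 million, treat the result as a rough upper estimate.

```python
# Figures quoted in the post; the gross is "almost" this much, so the
# ratio below is a rough upper estimate rather than an exact value.
budget = 60_000
worldwide_gross = 250_000_000

# How many dollars came back at the box office for each dollar of budget.
ratio = worldwide_gross / budget

print(f"Each $1 of budget returned roughly ${ratio:,.0f} at the box office")
```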
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
That is just a small amount of what you could do with this data, especially if you use the full set from The Numbers (<a href="https://drive.google.com/open?id=0B0-WU3yuHa7GYTE5dWdScXJMZkk" target="_blank">Fathom</a>, <a href="https://drive.google.com/open?id=1H2LLT2C4nMiEcDzqTKnhNwreHhbLulH_HDLtfPNmQlc" target="_blank">Google Sheets</a>).</div>
<h2 style="clear: both; text-align: left;">
Sample Questions</h2>
<ul>
<li>What I usually do with these sites is ask something more general. I introduce them and then just ask "What story does this data tell? Use graphs and calculations to tell your story."</li>
<li>Another thing I ask is to look at the all time list and use a site like <a href="http://natoonline.org/data/ticket-price/" target="_blank">http://natoonline.org/data/ticket-price/</a> to put everything in today's dollars. They can check their answers on the <a href="http://www.boxofficemojo.com/alltime/adjusted.htm" target="_blank">Box Office Mojo summary page</a> where they show that Gone With the Wind, adjusted for inflation, would have grossed over $1.7 billion domestically (there is no worldwide data). Or even look at the story that they tell about <a href="http://www.boxofficemojo.com/about/adjuster.htm" target="_blank">adjusted data</a>. The dataset on movie ticket prices alone is pretty good for analysis.</li>
<li>For the younger grades you could make bar graphs or circle graphs about their favourite movie franchise, for example, like Harry Potter (<a href="https://drive.google.com/open?id=1VF0F6z5LPU_vThChnc3n76UKt70kq-Go6JpQ2UyFQpo" target="_blank">Google Sheets</a>, <a href="https://drive.google.com/open?id=1_bjyiy5oXQ5K0iUprjpiWVD8Qa59TnPRarnwTlgXP0g" target="_blank">Google Sheets with Graphs</a>)</li>
</ul>
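For the "today's dollars" question, one common approach is to estimate how many tickets a movie sold at the average price of its day and then re-price those tickets at today's average. Here is a hedged Python sketch of that idea. The gross and the ticket prices are made-up placeholders, so look up real average prices from a ticket price table before using it with students.

```python
def adjust_gross(gross, avg_price_then, avg_price_now):
    """Estimate a gross in today's dollars by re-pricing the tickets sold."""
    estimated_tickets = gross / avg_price_then
    return estimated_tickets * avg_price_now


# Made-up placeholder numbers, NOT real ticket prices or grosses.
original_gross = 10_000_000
price_then = 2.50
price_now = 10.00

adjusted = adjust_gross(original_gross, price_then, price_now)

print(f"${original_gross:,} then is roughly ${adjusted:,.0f} at today's prices")
```

Students can then compare their adjusted numbers against a published adjusted list and discuss why the two might not match exactly.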
<h2>
Other Movie Resources</h2>
The <a href="http://fivethirtyeight.com/">FiveThirtyEight.com</a> site often does stories <a href="http://fivethirtyeight.com/tag/movies/" target="_blank">on movies</a> and there is a great podcast about the problems with the movie rating sites and how they handle data. Read and listen about it <a href="http://fivethirtyeight.com/datalab/rating-subjective-experiences-is-hard-but-fandango-is-really-bad-at-it/" target="_blank">here</a> and <a href="http://fivethirtyeight.com/features/fandango-movies-ratings/" target="_blank">here</a>. And of course there are the famous movie quotes as <a href="http://flowingdata.com/2010/03/08/data-underload-12-famous-movie-quotes/" target="_blank">visualizations</a>.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://i1.wp.com/flowingdata.com/wp-content/uploads/2010/03/underload-122.png?zoom=2&resize=545%2C967" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://i1.wp.com/flowingdata.com/wp-content/uploads/2010/03/underload-122.png?zoom=2&resize=545%2C967" width="550" /></a></div>
<h2>
Download the Data</h2>
<ul>
<li>Of course go to <a href="http://www.the-numbers.com/" target="_blank">The Numbers</a> and <a href="http://www.boxofficemojo.com/" target="_blank">Box Office Mojo</a> at any time to get the most up to date data on movies. All the files I analyzed here can be found in <a href="https://drive.google.com/folderview?id=0B0-WU3yuHa7GcE5BVHJxS2FBaDA&usp=sharing" target="_blank">this folder</a>. Note that all of these files were generated BEFORE <a href="http://www.the-numbers.com/movie/Star-Wars-Ep-VII-The-Force-Awakens#tab=summary" target="_blank">Star Wars: The Force Awakens</a> came out so it will be interesting to see how it changes the data.</li>
</ul>
<br />
<span style="background-color: white; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.2px;">Let me know if you used these data sets or if you have suggestions of what to do with them beyond this. Or if you created a lesson based on this data, share it below.</span>David Petrohttp://www.blogger.com/profile/16551690042242217798noreply@blogger.com0