CHAPTER 10. Graphs, Good and Bad

2 DISPLAYING DATA The first part of this course dealt with the production of data through random sampling and randomized comparative experiments. This unit focuses on good ways to summarize and organize data.

3 DATA TABLES Who did you vote for in the 2008 presidential election? One way to organize the responses for all Americans is to create a data table. A good data table should contain: a clear main heading; clearly labeled variables; and rates (percentages or proportions), either instead of counts or to supplement them.

4 EXAMPLE 10.1 Votes in 2008 Presidential Election

Candidate           Number of votes   Percentage
Barack Obama        69,456,           %
John McCain         59,934,           %
Ralph Nader         738,              %
Bob Barr            523,              %
Chuck Baldwin       199,              %
Cynthia McKinney    161,              %
Other               242,              %
Total               131,257,          %

Data tables show what values a variable takes and how often it takes these values. In other words, data tables present the distribution of a variable.
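
To illustrate why rates belong next to counts, here is a minimal Python sketch that turns counts into percentages of the total. The category names and counts are hypothetical, invented for illustration; they are not the election figures, and `with_percentages` is just a helper name chosen for this sketch.

```python
# Hypothetical survey counts, invented for illustration only.
counts = {"Candidate A": 412, "Candidate B": 355, "Candidate C": 33}


def with_percentages(counts):
    """Return rows of (category, count, percent of total)."""
    total = sum(counts.values())
    return [(name, n, 100 * n / total) for name, n in counts.items()]


rows = with_percentages(counts)

# Print a small data table: heading, labeled columns, counts and rates.
print("Hypothetical survey results")
print(f"{'Candidate':<12} {'Count':>6} {'Percent':>8}")
for name, n, pct in rows:
    print(f"{name:<12} {n:>6} {pct:>7.1f}%")
print(f"{'Total':<12} {sum(counts.values()):>6} {100.0:>7.1f}%")
```

The percentages make it easy to compare categories at a glance, which raw counts alone do not.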

5 TYPES OF VARIABLES Some variables place individuals into categories (like eye color or gender), while others have a meaningful numerical scale (like height, age, or exam score). There are two types of variables: A categorical variable places an individual into one of several categories. A quantitative variable takes numerical values for which arithmetic operations such as averaging make sense.
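
The distinction can be sketched in a few lines of Python: we count a categorical variable, and we average a quantitative one. The records below are hypothetical, invented for illustration, as are the field names `eye_color` and `height_cm`.

```python
from collections import Counter
from statistics import mean

# Hypothetical records, invented for illustration.
students = [
    {"eye_color": "brown", "height_cm": 170},
    {"eye_color": "blue", "height_cm": 162},
    {"eye_color": "brown", "height_cm": 181},
    {"eye_color": "green", "height_cm": 175},
]

# Categorical: count how many individuals fall into each category.
eye_color_counts = Counter(s["eye_color"] for s in students)

# Quantitative: arithmetic such as averaging makes sense.
mean_height = mean(s["height_cm"] for s in students)

print(dict(eye_color_counts))
print(mean_height)
```

Averaging eye colors would be meaningless, which is exactly what separates the two types.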

6 CATEGORICAL VARIABLES Pie charts and bar graphs are good ways to show the distribution of a categorical variable. We could therefore summarize our presidential election data with either a pie chart or a bar graph.

7 PIE CHART [Figure: pie chart "Voters in 2008 Presidential Election". Slices: Obama 53%, McCain 46%, Nader 1%; Barr, Baldwin, McKinney, and Other each round to 0%.]

8 BAR GRAPH [Figure: bar graph "Voters in 2008 Presidential Election". Horizontal axis: Candidate (Obama, McCain, Nader, Barr, Baldwin, McKinney, Other). Vertical axis: Number of Voters, from 0 to 80,000,000.]
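
The idea behind a bar graph is that each category gets a bar whose length is proportional to its count. A minimal text-only sketch in Python, using hypothetical counts rather than the election data (`text_bar_chart` and its `width` parameter are names chosen for this sketch):

```python
def text_bar_chart(counts, width=40):
    """Return one line per category, with bar length proportional to count."""
    largest = max(counts.values())
    lines = []
    for name, n in counts.items():
        # Scale so that the largest category fills the full width.
        bar = "#" * round(width * n / largest)
        lines.append(f"{name:<12} {bar} {n}")
    return lines


# Hypothetical counts, invented for illustration only.
counts = {"Candidate A": 412, "Candidate B": 355, "Candidate C": 33}
chart = text_bar_chart(counts)
print("\n".join(chart))
```

Because every bar starts at zero and only its length varies, the eye compares the categories in the same proportion as the counts.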

9 PICTOGRAMS Pictograms are another method of displaying the distribution of a categorical variable. What is a problem with this graphic?

10 PICTOGRAMS Here are two charts that display the same information: ownership among certain types of pets. Pictograms are often misleading because they misrepresent the differences between values of the categorical variable. The artists who produce pictograms often sacrifice the accuracy of the data so that they can avoid distorting the pictures being used.
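
One common way this misrepresentation arises can be worked out with simple arithmetic. Assume the artist keeps the picture's proportions and scales both its height and its width with the value; the picture's area then grows with the square of the value. The sketch below uses hypothetical values, and `pictogram_area_ratio` is a name chosen for illustration.

```python
def pictogram_area_ratio(value_small, value_large):
    """Ratio of picture areas when both height and width scale with the value."""
    linear_ratio = value_large / value_small
    # Area = height * width, and both grow by the same linear ratio.
    return linear_ratio ** 2


# Hypothetical values: one category is twice as large as the other.
small, large = 1.0, 2.0
data_ratio = large / small
area_ratio = pictogram_area_ratio(small, large)
print(f"Data ratio: {data_ratio:.0f}x, apparent (area) ratio: {area_ratio:.0f}x")
```

Under that assumption, a value that is twice as large is drawn with a picture four times as large, so the difference looks bigger than it is.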

11 LINE GRAPHS Line graphs display how a quantitative variable changes over time. A line graph of a variable plots each observation against the time at which it was measured. We always put time on the horizontal axis (x-axis) and the variable on the vertical axis (y-axis). We then connect the data points to display the change over time.
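
The construction can be sketched without any plotting library: order the observations by time, then join each point to the next. The yearly values below are hypothetical, invented for illustration.

```python
# Hypothetical yearly observations, listed out of order on purpose.
observations = {2003: 14, 2001: 10, 2004: 13, 2002: 12}

# Time goes on the horizontal axis, so order the points by time.
points = sorted(observations.items())

# Connecting consecutive points gives the line segments of the graph.
segments = list(zip(points, points[1:]))

for (t0, y0), (t1, y1) in segments:
    print(f"{t0} -> {t1}: change of {y1 - y0:+d}")
```

Each segment shows the change between two consecutive measurements, which is what the connected line makes visible.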

12 EXAMPLE 10.2 For any line graph, we want to look for an overall pattern and any striking deviations from that pattern. What is the overall pattern? Are there any striking deviations from that pattern? [Figure: line graph "Sales of New Trucks". Horizontal axis: Year. Vertical axis: Count.]

13 SEASONAL VARIATION Some line graphs display what is known as seasonal variation: a pattern that repeats itself at regular time intervals. Series of regular measurements over time are often seasonally adjusted. This means that the expected seasonal variation is removed before the data are published.
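
To make the idea of seasonal adjustment concrete, here is a deliberately simple Python sketch: estimate each season's typical deviation from the overall mean, then subtract it. This is only one basic approach, assumed here for illustration, and the quarterly values are hypothetical; published series are typically adjusted with more elaborate methods.

```python
from statistics import mean


def seasonally_adjust(values, period):
    """Subtract each season's average deviation from the overall mean."""
    overall = mean(values)
    # Average of every period-th value, starting at season s, minus the overall mean.
    seasonal = [mean(values[s::period]) - overall for s in range(period)]
    return [v - seasonal[i % period] for i, v in enumerate(values)]


# Hypothetical quarterly counts for two years, with a spike every third quarter.
quarterly = [10, 12, 20, 11, 12, 14, 22, 13]
adjusted = seasonally_adjust(quarterly, period=4)
print(adjusted)
```

With the repeating third-quarter spike removed, the adjusted values are flat within each year and higher in the second year, so the underlying increase is easier to see.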

14 EXAMPLE 10.3 Notice that the line graph shows seasonal variation: every year there is a spike in airline passengers. The overall trend is an increase in airline passengers.

15 MISREPRESENTING DATA The most common way to misrepresent data in a line graph is through the choice of scales. With this scale, the number of unmarried couples appears to increase rather slowly over time. [Figure: line graph "Unmarried Couples". Horizontal axis: Year. Vertical axis: Unmarried Couples (thousands).]

16 MISREPRESENTING DATA However, when we switch scales for the same data, we might be inclined to draw a different conclusion. While this line graph still shows an increasing trend, the increase looks much more dramatic than in the previous line graph. [Figure: line graph "Unmarried Couples". Horizontal axis: Year. Vertical axis: Unmarried Couples (thousands).]
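
The effect of the scale can be put into numbers: the same change in the data fills a different share of the vertical axis depending on the axis range we pick. The sketch below is a rough illustration; the endpoints and axis ranges are hypothetical, not the values behind the slides' graphs, and `apparent_rise` is a name chosen for this sketch.

```python
def apparent_rise(first, last, axis_min, axis_max):
    """Fraction of the vertical axis height covered by the change in the data."""
    return (last - first) / (axis_max - axis_min)


# Hypothetical series endpoints (in thousands), not the slides' actual data.
first, last = 3000, 5000

# Same data, two different vertical scales.
wide = apparent_rise(first, last, axis_min=0, axis_max=20000)
tight = apparent_rise(first, last, axis_min=3000, axis_max=5000)

print(f"Wide scale: rise fills {wide:.0%} of the axis")
print(f"Tight scale: rise fills {tight:.0%} of the axis")
```

The data are identical in both cases; only the axis range changes how steep the increase appears.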

17 MAKING GOOD GRAPHS
Title, Label, Scale: Make sure labels and legends describe the variables and their measurement units. Be careful with the scales used.
Make the data stand out: We want the data themselves, rather than any background art or labels, to catch the viewer's attention. Avoid pictograms and be careful when choosing scales. Avoid 3D effects or other graphics that might confuse people.

18 REMINDERS Chapter 10 homework is posted online and is due Friday.