Dr. Allen Back. Aug. 26, 2016

1 Dr. Allen Back Aug. 26, 2016

10 AP Stats vs Some different emphases. But generally comparable. So retaking not recommended. Though many people do. If you do, please be sure to work/attend regularly. Our exams are quite different. (And not that close to the textbook.) Lecture is a good guide to exam priorities. (Some explanations will go beyond what you will be responsible for.)

11 Per Capita CO2 Emissions. Units are metric tons per person per year.

13 Per Capita CO2 Emissions. 8 Most Populous Countries in the World (a few years ago):
Country      tons/yr
China          2.3
India          1.1
US            19.7
Indonesia      1.2
Brazil         1.8
Russia         9.8
Pakistan       0.7
Bangladesh     0.2

17 Per Capita CO2 Emissions. In order with positions (n = 8):
tons/yr:  0.2  0.7  1.1  1.2  1.8  2.3  9.8  19.7
Posn.:      1    2    3    4    5    6    7     8
The median is the middle value. A basic measure of center. When the sample size is even, we average the two middle values: median = (1.2 + 1.8)/2 = 1.5.
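The median rule above can be sketched in a few lines of Python. This is only an illustration, not code from the lecture; the function name `median_of` and the list name `co2` are made up for the example, and the data are the eight values from the table.

```python
def median_of(values):
    """Middle value of the sorted data; with n even, average the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Per capita CO2 emissions (tons/yr) for the 8 countries, in table order.
co2 = [2.3, 1.1, 19.7, 1.2, 1.8, 9.8, 0.7, 0.2]

print("median, all 8:     ", median_of(co2))
# Dropping the US value (19.7) leaves n = 7, so the median is a single data value.
print("median, without US:", median_of([v for v in co2 if v != 19.7]))
```

With n = 8 the function averages the values in positions 4 and 5; with n = 7 it returns the value in position 4.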

19 Per Capita CO2 Emissions. [Figure: histogram of all 8 values, tons/yr.] The US value of 19.7 is not in keeping with the rest of the data. Such a value is called an outlier.

21 Per Capita CO2 Emissions. [Figure: histogram of all but the US (n = 7).] Removal of the outlier gives a much more revealing histogram.

23 Per Capita CO2 Emissions. Without the outlier (n = 7):
tons/yr:  0.2  0.7  1.1  1.2  1.8  2.3  9.8
Posn.:      1    2    3    4    5    6    7
With n odd, the median is just the middle value of 1.2.

28 Per Capita CO2 Emissions. The first quartile Q1 or 25th percentile is defined to be the median of the bottom half of our data. For a data set of odd sample size, we do not include the median in the bottom half. Without the outlier (n = 7):
tons/yr:  0.2  0.7  1.1  1.2  1.8  2.3  9.8
Posn.:      1    2    3    4    5    6    7
Q1 = 0.7 (and Q3 = 2.3). The convention about not including the middle changed in the 3rd edition of (a sibling by the same authors of) our text.

29 Per Capita CO2 Emissions. For the original data set (n = 8):
tons/yr:  0.2  0.7  1.1  1.2  1.8  2.3  9.8  19.7
Posn.:      1    2    3    4    5    6    7     8
Q1 = (0.7 + 1.1)/2 = 0.9 and Q3 = (2.3 + 9.8)/2 = 6.05.
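The median-of-halves convention for quartiles can be written out as a short Python sketch. The helper names `middle` and `quartiles` are hypothetical, and the rule coded here is the one stated on the slides (the median is left out of both halves when n is odd); other texts and software use other conventions.

```python
def middle(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the bottom and top halves.

    When n is odd, the overall median belongs to neither half.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    bottom = ordered[:half]        # lowest n // 2 values
    top = ordered[n - half:]       # highest n // 2 values
    return middle(bottom), middle(top)


all_eight = [0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7]
without_us = all_eight[:-1]

print("all 8:      Q1, Q3 =", quartiles(all_eight))
print("without US: Q1, Q3 =", quartiles(without_us))
```

Slicing `n // 2` values from each end handles both cases at once: for n = 8 the halves are positions 1 to 4 and 5 to 8, and for n = 7 they are positions 1 to 3 and 5 to 7.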

32 Per Capita CO2 Emissions. The 5-number summary:
          min   Q1    median   Q3     max
all 8     0.2   0.9   1.5      6.05   19.7
w/o US    0.2   0.7   1.2      2.3     9.8
Boxplot: graphical form of the 5-number summary. [Figure: boxplot of all values, n = 8.] [Figure: boxplot without the outlier, n = 7.]
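As a rough check on the 5-number summaries, here is a small Python sketch that uses `statistics.median` from the standard library. The function name `five_number_summary` is invented for this example, and the quartiles follow the median-of-halves rule from the earlier slides, which is an assumption about how the numbers on this slide were produced.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    bottom = ordered[:half]       # excludes the median when n is odd
    top = ordered[n - half:]
    return (ordered[0], median(bottom), median(ordered), median(top), ordered[-1])


all_eight = [0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7]

print("all 8: ", five_number_summary(all_eight))
print("w/o US:", five_number_summary(all_eight[:-1]))
```

A boxplot draws exactly these five numbers: the box runs from Q1 to Q3 with a line at the median, and in the simplest version the whiskers reach to the min and max.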

35 Per Capita CO2 Emissions. Interquartile Range: IQR = Q3 - Q1. A basic measure of spread. The median and IQR are usually little affected by outliers. Resistant to outliers. Here:
                n   Q1    med   Q3     IQR
with outlier    8   0.9   1.5   6.05   5.15
w/o outlier     7   0.7   1.2   2.3    1.6
It is mostly because of the small sample size that the median and IQR change as much as they do here.

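36 Per Capita CO2 Emissions. Spreadsheets do wild things when computing quartiles (here Open Office, an Excel clone). One way to get such numbers: with 8 numbers there are 7 intervals in between, so the value in position k sits at the 100(k - 1)/7 %ile. 0.7 (position 2) is the 100/7, or about 14.3, %ile. 1.1 (position 3) is the 200/7, or about 28.6, %ile. 25 is three quarters of the way from 100/7 to 200/7, so the 25th %ile is 0.7 + 0.75(1.1 - 0.7) = 1.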
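The interpolation described above can be sketched in Python as follows. The function name `percentile_by_intervals` is hypothetical, and the code implements only the "n - 1 intervals" rule as described on this slide; whether a particular spreadsheet version uses exactly this rule is an assumption.

```python
def percentile_by_intervals(values, p):
    """Percentile by linear interpolation over the n - 1 gaps between sorted values.

    The smallest value is treated as the 0th percentile and the largest as the
    100th, with the remaining values spaced evenly in between.
    """
    ordered = sorted(values)
    n = len(ordered)
    pos = (n - 1) * p / 100        # fractional 0-based position
    lo = int(pos)                  # index of the value just below
    frac = pos - lo                # how far to move toward the next value
    if lo + 1 >= n:
        return ordered[-1]
    return ordered[lo] + frac * (ordered[lo + 1] - ordered[lo])


co2 = [0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7]

for p in (25, 50, 75):
    print(f"{p}th percentile: {percentile_by_intervals(co2, p):.3f}")
```

For p = 25 the position is 7 × 0.25 = 1.75, that is, three quarters of the way from 0.7 to 1.1, which gives 1 rather than the 0.9 obtained from the median-of-halves rule.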

39 Distribution of Quantitative Data. To write a verbal description: Comment on the shape, center, spread, and any unusual features of the distribution. [Figure: histogram of all but the US (n = 7).] Per Capita CO2 Emissions:
                n   Q1    med   Q3     IQR
with outlier    8   0.9   1.5   6.05   5.15
w/o outlier     7   0.7   1.2   2.3    1.6

41 Distribution of Quantitative Data. The distribution of per capita CO2 emissions of the 8 most populous countries is skewed to the right with a center somewhere between 1 and 2 tons per year. The U.S. and possibly Russia are outliers on the big side. The interquartile range of 1.6 tons per year (computed without the U.S.) reflects the fact that aside from the two biggest producers, the six other countries produce at most 2.3 tons per person per year. Note: "skewed to the right" (or "skewed positive") means stretched out towards the right, which is not the convention most newcomers find instinctive.

42 Forbes 790 CEO Salaries 1994

43 Forbes 790 CEO Salaries 1994

44 Forbes CEO Salaries 1994 Boxplot

45 Forbes All But Top 9 CEO Salaries 1994

46 Forbes All But Top 9 CEO Salaries 1994

47 Forbes CEO Salaries 1994 Boxplot

50 Comparing Measures of Center
                 n     Mean x̄    Median
All              790   [...]M    1.304M
Without top 9    781   [...]M    1.296M
Means heavily affected by outliers. Medians resistant to outliers.

53 Comparing Measures of Spread
                 n     Std. Dev. s   IQR
All              790   [...]M        1.662M
Without top 9    781   [...]M        1.662M
Std. Dev. heavily affected by outliers. IQR resistant to outliers.
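The CEO salary data are not reproduced in this transcript, so the following Python sketch illustrates the same point with the 8 CO2 values instead. The `inflated` list is a hypothetical data set, made up here by replacing the largest value (19.7) with 197; the helper names `iqr` and `summarize` are likewise invented for the example.

```python
from statistics import mean, median, stdev


def iqr(values):
    """Q3 - Q1, with quartiles taken as medians of the bottom and top halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])


def summarize(values):
    return {
        "mean": mean(values),
        "std dev": stdev(values),   # sample standard deviation s
        "median": median(values),
        "IQR": iqr(values),
    }


co2 = [0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7]
inflated = co2[:-1] + [197.0]       # hypothetical: largest value made 10x bigger

before = summarize(co2)
after = summarize(inflated)
for name in before:
    print(f"{name:>8}: {before[name]:8.2f} -> {after[name]:8.2f}")
```

Making the largest value more extreme leaves the median and IQR exactly where they were, because they depend only on values near the middle of the ordering, while the mean and standard deviation both grow substantially.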