Statistics Chapter Measures of Position LAB

Size: px
Start display at page:

Download "Statistics Chapter Measures of Position LAB"

Transcription

1 Statistics Chapter 2 Name: 2.5 Measures of Position LAB Learning objectives: 1. How to find the first, second, and third quartiles of a data set, how to find the interquartile range of a data set, and how to represent a data set graphically using a box-and whisker plot 2. How to interpret other fractiles such as percentiles and how to find percentiles for a specific data entry 3. Determine and interpret the standard score (z-score) Fractiles In this section, you will learn how to use fractiles to specify the location of a data entry within a data set. Fractiles are numbers that partition (divide) an ordered data set into equal parts. Quartiles First quartile, Q 1 : About one quarter of the data fall on or below Q 1. Second quartile, Q 2 : About one half of the data fall on or below Q 2 (median). Third quartile, Q 3 : About three quarters of the data fall on or below Q 3. Example: The number of nuclear power plants in the top 15 nuclear power-producing countries in the world are listed. Find the first, second, and third quartiles of the data set Solution: Q2 divides the data set into two halves. The first and third quartiles are the medians of the lower and upper halves of the data set.

2 Interquartile Range (IQR) The difference between the third and first quartiles. IQR = Q 3 Q 1. Recall that for the nuclear power plant data, Q 1 = 10, Q 2 = 18 and Q 3 = 31. The interquartile range is IQR = Q 3 Q 1 = = 21 The number of power plants in the middle portion of the data set vary by at most 21. Box-and-whisker plot Exploratory data analysis tool. Highlights important features of a data set. Requires (five-number summary): Minimum entry, First quartile Q 1, Median Q 2, Third quartile Q 3, Maximum entry How to draw a Box-and-Whisker Plot 1. Find the five-number summary of the data set. 2. Construct a horizontal scale that spans the range of the data. 3. Plot the five numbers above the horizontal scale. 4. Draw a box above the horizontal scale from Q 1 to Q 3 and draw a vertical line in the box at Q Draw whiskers from the box to the minimum and maximum entries. Example The five number summary for the power plant data is: Min = 6, Q 1 = 10, Q 2 = 18, Q 3 = 31, Max = 104 About half the scores are between 10 and 31. By looking at the length of the right whisker, you can conclude 104 is a possible outlier. Another way to identify outliers is to use the interquartile range.

3 Using the Interquartile Range to Identify Outliers 1. Find Q 1 and Q 3 2. Find the interquartile range, IQR = Q 3 Q 1 3. Multiply the IQR by Subtract (1.5) (IQR) from Q 1 to find the lower fence. lowerfence = Q (IQR) Any data entry less than the lower fence is an outlier. 5. Add 1.5 IQR to Q 3 to find the upper fence. upperfence = Q (IQR) Any data entry greater than the upper fence is an outlier. Example When carrying out this list of steps for the power plant data we find: 1. Q 1 = 10 and Q 3 = Find the interquartile range, IQR = Q 3 Q 1 = Multiply the IQR by 1.5: = lower fence = = There is no data entry less than 21.5 value. 5. upper fence = = So, 104 is an outlier since it is greater than Modified Boxplot The boxplot is redrawn with the whisker on the right extended out to the largest value in the data set that is not larger than the upper fence, namely data value 59. Had there been an outlier on the lower end, then our graph would have the left whisker extended to the lowest value in the data set that was not an outlier. Also, notice that outliers are marked on the graph with a cross marker. Your Turn! Use the sample data below answer the questions on the next page. The sample represents the number of paid vacation days used by 20 employees in a recent year

4 1. List the data in ascending fashion. 2. Find the minimum Find Q Find Q Find Q Find the maximum Find the lower fence Find the upper fence What numbers are outliers? Sketch the graph of a modified boxplot. Label the x axis. 11. About 75% of the employees in the sample took at least how many days off? What percentage of employees took more than 16 and a half days off? You randomly select one employee from the sample. What is the likelihood that the person took less than 20.5 days off? 13.

5 The ogive at the right represents the cumulative frequency distribution for SAT scores of collegebound students in a recent year. What SAT score represents the 62nd percentile? Answer/Interpretation: An SAT score of This means that approximately 62% of the students had an SAT score of 1600 or less. 14. What percentage of students scored at most 1500? 15. Approximately what SAT score represents the 90th percentile? How should you interpret this?

6 Another way to find a percentile is to use a formula. Definition To find the percentile that corresponds to a specific data entry x, use the formula number of data entries less than x Percentile of x = 100 total number of data entries and then round to the nearest whole number. Example days. For the vacation days data, find the percentile that corresponds to 8 vacation Answer There are two data entries less than 8 and the total number of data entries is 20. Percentile of 8 = number of data entries less than x total number of data entries 100 = = Interpretation 8 vacation days off corresponds to the 10th percentile. 8 vacation days is greater than 10% of the entries in the sample. Try These!: The exam scores of a particular geography class are listed Find the percentile that corresponds to an exam score of 67. How should you interpret this? 17. Find the percentile that corresponds to an exam score of 89. How should you interpret this? Standard Score (z-score) Every number x in a data set has a z-score A z-score represents the number of standard deviations a given value x is away from the mean µ. z-score formula z = x µ σ A z-score can be negative, positive or zero. When z is negative the corresponding x value is less than the mean When z is positive the corresponding x value is greater than the mean When z = 0 the corresponding x value is equal to the mean

7

8 How to use the calculator to graph a Boxplot The TI-83, TI-83 Plus, and TI-84 Plus calculators will take a list of data and automatically draw a box-and-whisker plot for that data. Since there are a number of different types of plots available on the calculator, it is important to make sure that all other plots are turned off before you begin or you graph will be cluttered with several unrelated plots being graphed at the same time. Even worse, it is possible for previous plots to become invalid or the data sets that were used before are changed or deleted, causing an error whenever you try to graph anything new. Before you begin any plotting: Press the Y= key at the top left of the keyboard, and delete or deselect any equations being plotted there. Press STAT PLOT (above the Y= key) and choose 4 to turn off any other statistical plots. There are three stages to creating a box-and-whisker plot. 1. Enter the data into a list as before. 2. Tell the calculator what kind of plot you want. 3. Tell the calculator what size to draw the window for the plot. Example: Household Incomes The following numbers represent household incomes in thousands of dollars for twelve households: We ll use this data to construct a box-and-whisker plot. First store the data in the list L1. Press the 2 nd key and then press the Y= key to get to STATPLOT. Press the number 1 key. Move the cursor to On and press the ENTER key. Use the arrow keys and highlight the box-and-whisker plot picture. Type L1 in for Xlist:

9 Set the Freq: to 1. Press the ZOOM key. Press the number 9 key. We have two different types of box-and-whisker plots to select from. One will separate outliers from the maximum or minimum value (the modified boxplot) and the other will include the outliers in to whiskers (the boxplot). If you select the picture that shows two dots after the maximum on the plot, the outliers will be shown outside of the whiskers.