Statistics Chapter 3 Triola (2014)

3-1 Review and Preview

Branches of statistics
Descriptive Stats: the branch of stats that involves the organization, summarization, and display of data.
Inferential Stats: the branch of stats that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential stats is probability (Larson & Farber, 2009).

CVDOT: Center, Variation, Distribution, Outliers, Time

Assignments for the week
Monday and Tuesday: 3-2, pg. 90, 1-4, 5-23 odd, 29-31 odd
Wednesday: 3-3, pg. 106, 5-23 odd, 33, 35, 39
Thursday: 3-4, pg. 123, 1-31 odd
Friday: Activity
On Wednesday, February 17th: Exam #1 over Ch. 1, 2, and 3

References
Larson, R., & Farber, B. (2009). Elementary statistics: Picturing the world (4th ed.). Upper Saddle River, NJ: Pearson Prentice Hall.
Triola, M. F. (2014). Elementary statistics (12th ed.). Boston, MA: Pearson.

Lesson 3-2 Measures of CENTER

MEASURE of CENTRAL TENDENCY: a value that represents a typical, or central, entry of a data set. The three most commonly used measures are the MEAN, MEDIAN, and MODE.

MEAN: the sum of the data entries divided by the number of entries.
Population mean: μ = Σx / N
Sample mean: x̄ = Σx / n
The lowercase Greek letter μ (mu) represents the population MEAN and x̄ (x bar) represents the sample mean. NOTE that N represents the number of entries in a POPULATION and n represents the number of entries in a SAMPLE. The Greek letter Σ (sigma) indicates a summation of values.

Example 1: Look at the data from the different chocolate chip cookies. For the Chips Ahoy regular cookies: 22, 22, 26, 24, 23 chips.
NOW LOOK AT OUR COOKIES from Friday.

MEDIAN: the value that lies in the middle of the data when the data is ordered. The median measures the CENTER of an ordered data set by dividing it into two equal parts. If the data set has an odd number of entries, the median is the middle one. If it has an even number of entries, the median is the mean of the two middle data entries.

Example 2: Now find the median of the chips: 22, 22, 26, 24, 23, 27

MODE: the data entry that occurs most often. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry is a mode and the data is called bimodal.
Bimodal: two entries share the greatest frequency.
Multimodal: more than two entries share the greatest frequency.
No Mode: no entry is repeated.

MidRange: the measure of center that is the value midway between the max and min values in the original data set. Midrange = (max + min) / 2
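As a quick check of the definitions above, the following Python sketch computes each measure of center for the chip counts in Examples 1 and 2. It uses the standard-library statistics module; the variable names are illustrative only and are not part of the notes.

```python
import statistics

# Chips per cookie from Example 1 (Chips Ahoy regular)
chips = [22, 22, 26, 24, 23]

mean_chips = statistics.mean(chips)            # sum of entries / number of entries
median_chips = statistics.median(chips)        # middle value of the ordered data
modes_chips = statistics.multimode(chips)      # every value with the greatest frequency
midrange_chips = (max(chips) + min(chips)) / 2 # midway between max and min

# Example 2: six entries, so the median is the mean of the two middle values
chips_even = [22, 22, 26, 24, 23, 27]
median_even = statistics.median(chips_even)

print("mean:", mean_chips)
print("median:", median_chips)
print("mode(s):", modes_chips)
print("midrange:", midrange_chips)
print("median of Example 2:", median_even)
```

One difference to keep in mind: when no value repeats, statistics.multimode returns every value, whereas the notes say such a data set has no mode.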

Part 2: Calculating the Mean from a Frequency Distribution
We don't know the exact values of the data, so we make the calculation possible by pretending that all sample values in each class are equal to the class midpoint:
x̄ = Σ(f · x) / Σf, where x is the class midpoint and f is the class frequency.

Weighted Mean and Mean of Grouped Data
Sometimes data sets contain entries that have a greater effect on the mean than do other entries. Because of this, you want to find the
Weighted Mean: the mean of a data set whose entries have varying weights:
x̄ = Σ(w · x) / Σw, where w is the weight of each entry x.
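Both calculations follow the same pattern: multiply each value by its frequency or weight, add the products, and divide by the total frequency or weight. A minimal sketch in Python, where the class limits, frequencies, scores, and weights are all made-up numbers for illustration (they are not from the class data):

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency)
classes = [(10, 14, 3), (15, 19, 7), (20, 24, 8), (25, 29, 2)]

# Pretend every value in a class equals the class midpoint
midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]

# Mean from a frequency distribution: sum(f * x) / sum(f)
grouped_mean = sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

# Weighted mean: sum(w * x) / sum(w), with hypothetical scores and weights
scores = [85, 90, 70]
weights = [0.5, 0.3, 0.2]
weighted_mean = sum(w * x for w, x in zip(weights, scores)) / sum(weights)

print("grouped mean:", grouped_mean)
print("weighted mean:", round(weighted_mean, 2))
```

The grouped mean is only an approximation of the true mean, because the midpoints stand in for the unknown exact values.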

Lesson 3-3 Measures of Variation

Part 1: Range, deviation, variance, and standard deviation
Range: the difference between the max and the min value. Range = max - min
Deviation: the deviation of an entry x in a population data set is the difference between the entry and the mean of the data set. Deviation of x = x - μ
Population variance: σ² = Σ(x - μ)² / N
Population standard deviation: for a population data set of N entries, the square root of the population variance (Larson & Farber, 2009). σ = √( Σ(x - μ)² / N )
Sample variance and sample standard deviation of a SAMPLE of data of n entries:
s² = Σ(x - x̄)² / (n - 1)
s = √( Σ(x - x̄)² / (n - 1) )
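The sketch below works through these measures step by step for the five Chips Ahoy counts from Lesson 3-2 (22, 22, 26, 24, 23). Treating the same five values first as a population and then as a sample is only for illustration, to show how dividing by N versus n - 1 changes the result.

```python
import math
import statistics

chips = [22, 22, 26, 24, 23]
n = len(chips)

data_range = max(chips) - min(chips)      # Range = max - min
mean = sum(chips) / n                     # 23.4
deviations = [x - mean for x in chips]    # each entry minus the mean
sum_sq = sum(d ** 2 for d in deviations)  # sum of squared deviations

pop_var = sum_sq / n                      # population variance: divide by N
pop_sd = math.sqrt(pop_var)               # population standard deviation
samp_var = sum_sq / (n - 1)               # sample variance: divide by n - 1
samp_sd = math.sqrt(samp_var)             # sample standard deviation

# Cross-check against the standard library
assert math.isclose(pop_sd, statistics.pstdev(chips))
assert math.isclose(samp_sd, statistics.stdev(chips))

print("range:", data_range)
print("population variance and SD:", round(pop_var, 2), round(pop_sd, 2))
print("sample variance and SD:", round(samp_var, 2), round(samp_sd, 2))
```

The deviations always sum to zero (up to rounding), which is why they are squared before being averaged.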

3-3 Continued

Standard deviation: a measurement of how much data values deviate away from the mean.

PROPERTIES OF SD
1. The standard deviation is a measure of how much data values deviate away from the mean.
2. The value of the standard deviation s is usually positive. It is zero only when all of the data values are the same number, and it is never negative. Larger values of s indicate greater amounts of variation.
3. The value of s can increase dramatically with the inclusion of one or more outliers.
4. The units of the SD s (such as minutes, feet, pounds, and so on) are the same as the units of the original data values.
5. The sample SD s is a biased estimator of the population standard deviation, as described in Part 2 of the section.

Home made (24 values): 5, 5, 6, 7, 7, 4, 5, 5, 5, 7, 8, 7, 7, 10, 10, 12, 12, 14, 6, 7, 9, 10, 10, 12
Store bought (21 values): 13, 16, 16, 17, 17, 18, 19, 20, 21, 11, 12, 13, 13, 13, 14, 14, 14, 15, 17, 23, 25

          Goody   Super 1
Number    24      21
Mean      7.9     16.2
Range     10      14
SD        2.76    3.71
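The summary values can be reproduced from the two data columns. A minimal sketch using Python's statistics module, assuming the home-made column corresponds to the Goody summary and the store-bought column to the Super 1 summary (the counts of 24 and 21 match that reading); the summarize helper is a made-up name for illustration.

```python
import statistics

home_made = [5, 5, 6, 7, 7, 4, 5, 5, 5, 7, 8, 7, 7, 10, 10, 12,
             12, 14, 6, 7, 9, 10, 10, 12]
store_bought = [13, 16, 16, 17, 17, 18, 19, 20, 21, 11, 12, 13, 13,
                13, 14, 14, 14, 15, 17, 23, 25]

def summarize(data):
    """Return the number of values, mean, range, and sample SD."""
    return {
        "number": len(data),
        "mean": round(statistics.mean(data), 1),
        "range": max(data) - min(data),
        "sd": round(statistics.stdev(data), 2),  # sample SD (divides by n - 1)
    }

goody = summarize(home_made)
super_1 = summarize(store_bought)

print("Goody:", goody)
print("Super 1:", super_1)
```

Both sets are samples, so statistics.stdev (the n - 1 version) is the appropriate choice rather than statistics.pstdev.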

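3-3 Part 2

Empirical Rule (or 68-95-99.7 Rule) for Data with a Bell-Shaped Distribution: about 68% of all values fall within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations.

Chebyshev's Theorem: applies to all data; the data does not have to be bell-shaped, as it does in order to use the Empirical Rule. For any K greater than 1, at least 1 - 1/K² of the values lie within K standard deviations of the mean.

Comparing Variation in Different Populations: In Part 1 it was explained that when comparing variation in two different sets of data, the standard deviations should only be compared if the two sets of data use the same scale and units and have approximately the same mean. If the samples use different scales or measurement units, we can use the coefficient of variation: CV = (s / x̄) · 100% for a sample, or CV = (σ / μ) · 100% for a population.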
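The sketch below puts the three ideas side by side. It applies Chebyshev's theorem and the coefficient of variation to the rounded Goody and Super 1 summary values from the 3-3 notes (means 7.9 and 16.2, SDs 2.76 and 3.71). The function names are made up for illustration, and the Empirical Rule percentages are stored only as a lookup, since nothing here establishes that the cookie data is bell-shaped.

```python
# Empirical Rule: approximate share of values within k SDs (bell-shaped data only)
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

def chebyshev_min_fraction(k):
    """Minimum fraction of values within k SDs of the mean, for any data set."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem needs k > 1")
    return 1 - 1 / k ** 2

def within_k_sd(mean, sd, k):
    """Interval from mean - k*sd to mean + k*sd."""
    return (mean - k * sd, mean + k * sd)

def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

cheb_2 = chebyshev_min_fraction(2)      # at least 3/4 of the values
cheb_3 = chebyshev_min_fraction(3)      # at least 8/9 of the values
low, high = within_k_sd(7.9, 2.76, 2)   # Goody: mean +/- 2 SD

cv_goody = coefficient_of_variation(2.76, 7.9)
cv_super_1 = coefficient_of_variation(3.71, 16.2)

print("Chebyshev, k = 2 and k = 3:", cheb_2, round(cheb_3, 3))
print("Goody, within 2 SD:", round(low, 2), "to", round(high, 2))
print("CV Goody:", round(cv_goody, 1), "%")
print("CV Super 1:", round(cv_super_1, 1), "%")
```

Super 1 has the larger standard deviation, but Goody has the larger coefficient of variation, so relative to its mean the Goody data varies more.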

Stats216Chapter3 Notes.notebook

3-4 Measures of Relative Standing and Boxplots
Part 1: Basics of z Scores, Percentiles, Quartiles, and Boxplots
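A z score expresses how many standard deviations a value lies above or below the mean: z = (x - mean) / SD. The sketch below is a hedged illustration using the home-made cookie counts and the rounded Goody summary values (mean 7.9, SD 2.76) from the 3-3 notes. The percentile calculation uses the simple "percent of values below x" convention, which is an assumption here because the notes name the topic without giving a formula; the function names are made up.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def percentile_of(value, data):
    """Percent of data values that are less than the given value."""
    below = sum(1 for x in data if x < value)
    return below / len(data) * 100

home_made = [5, 5, 6, 7, 7, 4, 5, 5, 5, 7, 8, 7, 7, 10, 10, 12,
             12, 14, 6, 7, 9, 10, 10, 12]

z_max = z_score(14, 7.9, 2.76)        # z score of the largest home-made value
z_min = z_score(4, 7.9, 2.76)         # z score of the smallest home-made value
pct_10 = percentile_of(10, home_made)

print("z for 14:", round(z_max, 2))
print("z for 4:", round(z_min, 2))
print("percentile of 10:", round(pct_10))
```

Because z scores have no units, they allow values from different data sets to be compared on the same scale.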


3-4 Quartiles
Box-and-Whisker Plot: an exploratory data analysis tool that highlights the important features of a data set. In order to graph it, you must know:
1. The minimum
2. The first quartile, Q1
3. The median, or Q2
4. The third quartile, Q3
5. The maximum
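The five values needed for the plot can be found as follows. This is a sketch under one common textbook convention for locating percentiles (compute L = (k / 100) · n; if L is a whole number, average the Lth and (L + 1)th sorted values, otherwise round L up). Calculators and software may use slightly different rules and give slightly different quartiles. The data is the home-made cookie column from the 3-3 notes, and the function names are made up for illustration.

```python
import math

def percentile_value(sorted_data, k):
    """Value of the kth percentile (0 < k < 100) using a locator L = k/100 * n."""
    n = len(sorted_data)
    locator = k / 100 * n
    if locator == int(locator):
        i = int(locator)
        # Whole number: average the Lth and (L + 1)th values (1-based positions)
        return (sorted_data[i - 1] + sorted_data[i]) / 2
    # Otherwise round L up and take the value at that 1-based position
    return sorted_data[math.ceil(locator) - 1]

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(data)
    return (
        ordered[0],
        percentile_value(ordered, 25),
        percentile_value(ordered, 50),
        percentile_value(ordered, 75),
        ordered[-1],
    )

home_made = [5, 5, 6, 7, 7, 4, 5, 5, 5, 7, 8, 7, 7, 10, 10, 12,
             12, 14, 6, 7, 9, 10, 10, 12]

minimum, q1, q2, q3, maximum = five_number_summary(home_made)
print("min, Q1, median, Q3, max:", minimum, q1, q2, q3, maximum)
```

The box of the plot runs from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.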

3-4 Comparing Box Plots
Normal distribution: heights from a simple random sample of women
Skewed distribution: salaries (in thousands of dollars) of NCAA football coaches
On Calculator

Attachments
Stats216 Ch3 Triola.pdf
Cookie Data 2015.xlsx
HeartRate_2015.xlsx
STAT216 Spring 2014.xlsx
Empirical Rule.pdf