8.2.1 Conditions for Estimating p

As always, inference is based on the sampling distribution of a statistic. We described the sampling distribution of a sample proportion p-hat in section 7.2. Here is a brief review of its important properties:

Shape: If the sample size is large enough that both np and n(1 - p) are at least 10 (Normal condition), the sampling distribution of p-hat is approximately Normal.
Center: The mean is p. That is, the sample proportion p-hat is an unbiased estimator of the population proportion p.
Spread: The standard deviation of the sampling distribution of p-hat is $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, provided that the population is at least 10 times as large as the sample (10% condition).

In practice, of course, we don't know the value of p. If we did, we wouldn't need to construct a confidence interval for it! So we cannot check whether np and n(1 - p) are 10 or greater. In large samples, p-hat will be close to p. Therefore, we replace p by p-hat in checking the Normal condition.

Example: The Beads (Checking conditions)
Mr. Vignolini's AP Statistics class wants to estimate the actual proportion of red beads in a large jar. The jar contains thousands of colored beads, each either red or white. The class wants to construct a confidence interval for the proportion p of red beads in the container. They took an SRS of 251 beads and found 107 red beads and 144 white beads. Check that the conditions for constructing a confidence interval for p are met.
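The condition checks for the beads example can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name `check_conditions` and the population size of 3000 beads are assumptions (the notes say only that the jar holds "thousands" of beads).

```python
def check_conditions(successes, n, population_size=None, random_sample=True):
    """Check the Random, Normal, and 10% conditions for a proportion.

    Returns the sample proportion and a dict of True/False results.
    The 10% entry is None when the population size is not supplied.
    """
    failures = n - successes
    p_hat = successes / n
    checks = {
        "Random": random_sample,
        # With p replaced by p-hat, n*p-hat and n*(1 - p-hat) are simply
        # the observed counts of successes and failures.
        "Normal": successes >= 10 and failures >= 10,
        "10%": None if population_size is None else population_size >= 10 * n,
    }
    return p_hat, checks


# Beads example: SRS of 251 beads, 107 red.
# population_size=3000 is a hypothetical stand-in for "thousands of beads".
p_hat, checks = check_conditions(107, 251, population_size=3000)
print(f"p-hat = {p_hat:.3f}")
for name, result in checks.items():
    print(f"{name} condition met: {result}")
```

Because 107 and 144 are both at least 10, the Normal condition holds here. The 10% condition needs at least 2510 beads in the jar, which is plausible for a jar holding thousands.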
CHECK YOUR UNDERSTANDING
In each of the following settings, check whether the conditions for calculating a confidence interval for the population proportion p are met.
1. An AP Statistics class at a large high school conducts a survey. They ask the first 100 students to arrive at school one morning whether or not they slept at least 8 hours the night before. Only 17 students say "Yes."
2. A quality control inspector takes a random sample of 25 bags of potato chips from the thousands of bags filled in an hour. Of the bags selected, 3 had too much salt.
8.2.2 Constructing a Confidence Interval for p

We can use the general formula from section 8.1 to construct a confidence interval for an unknown population proportion p:

statistic ± (critical value) · (standard deviation of statistic)

The sample proportion p-hat is the statistic we use to estimate p. When the Independent condition is met, the standard deviation of the sampling distribution of p-hat is:

$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

Since we don't know the value of p, we replace it with the sample proportion p-hat:

$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

This quantity is called the standard error (SE) of the sample proportion p-hat. It describes how close the sample proportion p-hat will be, on average, to the population proportion p in repeated SRSs of size n.

Standard Error: When the standard deviation of a statistic is estimated from data, the result is called the standard error of the statistic.

How do we get the critical value for our confidence interval? If the Normal condition is met, we can use a Normal curve. For the approximate 95% confidence intervals of section 8.1, we used a critical value of 2 based on the 68-95-99.7 rule for Normal distributions. We can get a more accurate critical value from the z-table or a calculator. As the figure below shows, the central 95% of the standard Normal distribution is marked off by two points, -z* = -1.96 and z* = 1.96. We use the * to remind you that this is a critical value, not a standardized score that has been calculated from data. To find a level C confidence interval, we need to catch the central area C under the standard Normal curve.
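One way to see what the standard error describes is to simulate many samples and watch how much p-hat varies. The sketch below is an illustration only: the population proportion p = 0.4, the sample size, the number of repetitions, and the function name `simulate_p_hats` are all assumptions chosen for the demonstration. Each draw is treated as independent, which matches the situation where the population is much larger than the sample.

```python
import math
import random
import statistics


def simulate_p_hats(p, n, reps, seed=0):
    """Draw `reps` samples of size n and return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(reps):
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats


p, n = 0.4, 251  # hypothetical population proportion and sample size
p_hats = simulate_p_hats(p, n, reps=2000)

# Spread of p-hat across repeated samples vs. the formula sqrt(p(1-p)/n)
simulated_sd = statistics.stdev(p_hats)
formula_sd = math.sqrt(p * (1 - p) / n)

# Standard error computed from a single sample, using p-hat in place of p
one_p_hat = p_hats[0]
standard_error = math.sqrt(one_p_hat * (1 - one_p_hat) / n)

print(f"simulated SD of p-hat: {simulated_sd:.4f}")
print(f"formula SD of p-hat:   {formula_sd:.4f}")
print(f"SE from one sample:    {standard_error:.4f}")
```

The three printed values should be close to one another, which is why replacing p with p-hat is a reasonable move when p is unknown.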
Example: 80% Confidence (Finding a critical value)
Use a z-table to find the critical value z* for an 80% confidence interval. Assume that the Normal condition is met.

For 80% confidence, the central area under the standard Normal curve is 0.80, which leaves area 0.10 in each tail. The critical value z* is therefore the point with area 0.90 to its left. You can also find the critical value using the command invnorm(0.9, 0, 1), which tells the calculator to find the z-value from the standard Normal curve that has area 0.9 to the left of it.

Once we find the critical value z*, our confidence interval for the population proportion p is:

$\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Notice that we replaced the standard deviation of p-hat with the formula for its standard error. The resulting interval is sometimes called a one-sample z interval for a population proportion.
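The same lookup can be done in Python with the standard library's `statistics.NormalDist`, whose `inv_cdf` method plays the role of the calculator's inverse Normal command. The helper name `critical_value` is an assumption made for this sketch.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* for a level-C interval on the standard Normal curve."""
    tail_area = (1 - confidence) / 2       # area in each tail
    return NormalDist().inv_cdf(1 - tail_area)  # area to the left of z*


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

For 80% confidence this gives a value near 1.28, and for 95% confidence a value near 1.96, in line with the z-table.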
Example: The Beads (Calculating a confidence interval for p)
Mr. Vignolini's class took an SRS of beads from the container and got 107 red beads and 144 white beads.
(a) Calculate and interpret a 90% confidence interval for p.
(b) Mr. Vignolini claims that exactly half of the beads in the container are red. Use your result from (a) to comment on this claim.
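A short Python sketch of both parts follows. It applies the one-sample z interval formula to the bead counts and then checks whether 0.5 falls inside the interval. The function name `one_sample_z_interval` is an assumption for illustration, and the critical value comes from `statistics.NormalDist` rather than a z-table, so results may differ from table-based work in the third decimal place.

```python
import math
from statistics import NormalDist


def one_sample_z_interval(successes, n, confidence):
    """Return (p_hat, margin_of_error, lower, upper) for a proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat, margin, p_hat - margin, p_hat + margin


# (a) 107 red beads in an SRS of 107 + 144 = 251 beads, 90% confidence
p_hat, margin, lower, upper = one_sample_z_interval(107, 251, 0.90)
print(f"p-hat = {p_hat:.3f}, margin of error = {margin:.3f}")
print(f"90% CI: {lower:.3f} to {upper:.3f}")

# (b) Is the claimed value p = 0.5 a plausible value for p?
claimed_p = 0.5
print("0.5 inside the interval:", lower <= claimed_p <= upper)
```

The interval runs from roughly 0.375 to 0.478. Since 0.5 lies above the upper endpoint, the sample gives reason to doubt the claim that exactly half of the beads are red.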
CHECK YOUR UNDERSTANDING
Alcohol abuse has been described by college presidents as the number one problem on campus, and it is an important cause of death in young adults. How common is it? A survey of 10,904 randomly selected U.S. college students collected information on drinking behavior and alcohol-related problems. The researchers defined "frequent binge drinking" as having five or more drinks in a row three or more times in the past two weeks. According to this definition, 2486 students were classified as frequent binge drinkers.
1. Identify the population and the parameter of interest.
2. Check conditions for constructing a confidence interval for the parameter.
3. Find the critical value for a 99% confidence interval. Show your method. Then calculate the interval.
4. Interpret the interval in context.
8.2.3 Putting It All Together: The Four-Step Process

Example: Teens Say Sex Can Wait (Confidence interval for p)
The Gallup Youth Survey asked a random sample of 439 U.S. teens aged 13 to 17 whether they thought young people should wait to have sex until marriage. Of the sample, 246 said "Yes." Construct and interpret a 95% confidence interval for the proportion of all teens who would say "Yes" if asked this question.
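The four steps can be laid out as a small Python sketch: state the parameter, check the conditions (plan), compute the interval (do), and write the conclusion in context. The function name `four_step_interval` and the figure of 1,000,000 used as a lower bound on the number of U.S. teens are assumptions for illustration. Any bound of at least 4390 would satisfy the condition that the population be at least 10 times the sample.

```python
import math
from statistics import NormalDist


def four_step_interval(successes, n, confidence, parameter,
                       random_sample, population_at_least):
    """Run the Plan, Do, and Conclude steps for a one-sample z interval."""
    failures = n - successes
    p_hat = successes / n

    # PLAN: check the conditions before doing any calculation
    conditions = {
        "Random": random_sample,
        "Normal": successes >= 10 and failures >= 10,
        "Independent": population_at_least >= 10 * n,
    }
    if not all(conditions.values()):
        raise ValueError(f"conditions not met: {conditions}")

    # DO: statistic +/- (critical value) * (standard error)
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    lower, upper = p_hat - margin, p_hat + margin

    # CONCLUDE: interpret the interval in context
    conclusion = (f"We are {confidence:.0%} confident that the interval from "
                  f"{lower:.3f} to {upper:.3f} captures {parameter}.")
    return conditions, (lower, upper), conclusion


# STATE: the parameter is defined in words before any computation
parameter = ("the true proportion of all U.S. teens aged 13 to 17 "
             "who would say 'Yes'")
conditions, interval, conclusion = four_step_interval(
    246, 439, 0.95, parameter,
    random_sample=True,
    population_at_least=1_000_000,  # hypothetical lower bound, see lead-in
)
print(conditions)
print(conclusion)
```

Carrying full precision gives an interval of about 0.514 to 0.607. Rounding p-hat to 0.56 and the margin of error to 0.046 before combining them gives 0.514 to 0.606 instead, so small differences in the last digit come from rounding, not from the method.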
Remember that the margin of error in a confidence interval includes only sampling variability! There are other sources of error that are not taken into account. As is the case with many surveys, we are forced to assume that the teens answered truthfully. If they didn't, then our estimate may be biased. Other problems like nonresponse and question wording can also affect the results of this or any other poll. Lesson: Sampling beads is much easier than sampling people!

AP EXAM TIP: If a free-response question asks you to construct and interpret a confidence interval, you are expected to do the entire four-step process. That includes clearly defining the parameter and checking conditions.

AP EXAM TIP: You may use your calculator to compute a confidence interval on the AP exam. But there's a risk involved. If you just give the calculator answer with no work, you'll get either full credit for the "Do" step (if the interval is correct) or no credit (if it's wrong). We recommend showing the calculation with the appropriate formula and then checking with your calculator. If you opt for the calculator-only method, be sure to name the procedure (e.g., one-proportion z interval) and to give the interval (e.g., 0.514 to 0.606).
8.2.4 Choosing the Sample Size

In planning a study, we may want to choose a sample size that allows us to estimate a population proportion within a given margin of error. National survey organizations like the Gallup Poll typically sample between 1000 and 1500 American adults, who are interviewed by telephone. Why do they choose such sample sizes?

The margin of error (ME) in the confidence interval for p is:

$ME = z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Here, z* is the standard Normal critical value for the level of confidence we want. Because the margin of error involves the sample proportion of successes p-hat, we have to guess the latter value when choosing n. Here are two ways to do this:
1. Use a guess for p-hat based on a pilot study or on past experience with similar studies. You should do several calculations that cover the range of p-hat values you might get.
2. Use p-hat = 0.5 as the guess. The margin of error ME is largest when p-hat = 0.5, so this guess is conservative in the sense that if we get any other p-hat when we do our study, we will get a margin of error smaller than planned.

Once you have a guess for p-hat, the formula for the margin of error can be solved to give the sample size n needed.
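The algebra behind that last step can be written out as follows. This is a sketch that assumes the goal is a margin of error no larger than a chosen target, written here as ME, with p-hat standing for the guessed value.

```latex
\[
z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le ME
\quad\Longrightarrow\quad
\sqrt{n} \ge \frac{z^{*}\sqrt{\hat{p}(1-\hat{p})}}{ME}
\quad\Longrightarrow\quad
n \ge \left(\frac{z^{*}}{ME}\right)^{2}\hat{p}(1-\hat{p})
\]

% Why 0.5 is the conservative guess: the product is largest at p-hat = 1/2
\[
\hat{p}(1-\hat{p}) = \frac{1}{4} - \left(\hat{p}-\frac{1}{2}\right)^{2} \le \frac{1}{4}
\quad\Longrightarrow\quad
n \ge \left(\frac{z^{*}}{2\,ME}\right)^{2} \text{ is enough for any } \hat{p}
\]
```

Because n must be a whole number, the result is always rounded up to the next integer.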
Example: Customer Satisfaction (Determining sample size)
A company has received complaints about its customer service. The managers intend to hire a consultant to carry out a survey of customers. Before contacting the consultant, the company president wants some idea of the sample size that she will be required to pay for. One critical question is the degree of satisfaction with the company's customer service, measured on a five-point scale. The president wants to estimate the proportion p of customers who are satisfied (that is, who choose either "satisfied" or "very satisfied," the two highest levels on the five-point scale). She decides that she wants the estimate to be within 3% (0.03) at a 95% confidence level. How large a sample is needed?

Determine the sample size needed to estimate p within 0.03 with 95% confidence.
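A minimal Python sketch of this calculation follows. It assumes the table value z* = 1.96 for 95% confidence and the conservative guess p-hat = 0.5, since the example gives no prior estimate. The function name `sample_size` and its default arguments are choices made for this illustration.

```python
import math


def sample_size(margin_of_error, z_star=1.96, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin_of_error."""
    n_exact = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    # Clear away floating-point noise first, then always round UP,
    # because rounding down would give a margin of error that is too large.
    return math.ceil(round(n_exact, 6))


n_exact = (1.96 / 0.03) ** 2 * 0.5 * 0.5
print(f"exact solution: n >= {n_exact:.2f}")
print("sample size needed:", sample_size(0.03))  # 1068
```

Other margins of error, confidence levels, or guesses for p-hat can be explored by changing the arguments, for example `sample_size(0.03, p_guess=0.8)` when a prior survey suggests a value for p-hat.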
Why not round to the nearest whole number (in this case, 1067)? Because a smaller sample size will result in a larger margin of error, possibly more than the desired 3% for the poll.

If you want a 2.5% margin of error rather than 3%, then, keeping z* = 1.96 and the guess p-hat = 0.5:

$n \ge \left(\frac{1.96}{0.025}\right)^{2}(0.5)(0.5) = 1536.64$, so n = 1537.

For a 2% margin of error, the sample size you need is:

$n \ge \left(\frac{1.96}{0.02}\right)^{2}(0.5)(0.5) = 2401$, so n = 2401.

As usual, smaller margins of error call for larger samples.

News reports frequently describe the results of surveys with sample sizes between 1000 and 1500 and a margin of error of about 3%. These surveys generally use sampling procedures more complicated than a simple random sample, so the calculation of confidence intervals is more involved than what we have studied in this section. The calculations of the previous example still give you a rough idea of how such surveys are planned.
CHECK YOUR UNDERSTANDING
Refer to the previous example about the company's customer satisfaction survey.
1. In the company's prior-year survey, 80% of customers surveyed said they were "satisfied" or "very satisfied." Using this value as a guess for p-hat, find the sample size needed for a margin of error of 3% at a 95% confidence level.
2. What if the company president demands 99% confidence instead? Determine how this would affect your answer to Question 1.