Thinking About Chance & Probability Models


Chapters 17 and 18
November 14, 2012

Outline:
Chance Processes
Randomness and Probability
Probability Models
Probability Rules
Using Probability Models for Inference
How Probability Works

1.0 Chance Processes

People talk loosely about chance all the time. For scientific purposes, chance has a precise meaning: chance behavior is unpredictable in the short run but has a regular and predictable pattern in the long run. This definition presupposes that the process can be repeated over and over again, independently and under the same conditions. Many games fall into this category.

1.1 Heads or tails?

One simple game of chance involves betting on the toss of a coin. The result of any single toss cannot be predicted in advance, but there is still a regular pattern in the results: the proportion of heads settles down in the long run. The following figure shows the results of tossing a coin 1,000 times: HTTHHHT...
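
The long-run pattern can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin modeled by Python's pseudo-random generator, and the function name `running_proportions` and the seed are illustrative choices, not part of the original slides.

```python
import random

def running_proportions(n_tosses, seed=0):
    """Toss a simulated fair coin n_tosses times and return the
    proportion of heads observed after each toss."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1 head
        proportions.append(heads / toss)
    return proportions

props = running_proportions(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} tosses: proportion of heads = {props[n - 1]:.4f}")
```

Early proportions can wander far from 0.5, while the later ones should sit close to it.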

1.2 Random Sampling

Choose an S.R.S. (simple random sample) of size n from a population. Suppose the unknown parameter is the population proportion p, and let $\hat{p}$ be the corresponding sample proportion. Then $\hat{p}$ converges to p as n grows (law of large numbers), and the distribution of the sample estimate $\hat{p}$ approaches a normal density curve centered at p.

2.0 Randomness and Probability

A variable is called a random variable if its values are uncertain but there is some long-run stability in the process underlying the data.

Observational units? Variable?

[Figure: protagonist, helper, hinderer]

2.1 Example

Variable X: whether an infant chose the helper or the hinderer.

Each infant is like one repetition of the experiment, presumably under identical conditions. If we assume that each infant's selection is governed by chance, then X is a random variable. This is called choosing at random. Then the proportion of infants who choose the helper (say) will stabilize in the long run. This long-run stability point is called the probability of choosing the helper.

3.0 Probability Models

A probability model for a random variable describes the possible outcomes and says how to assign probabilities to them.

Random variable X: kind of toy selected.

Table: Probability Model for X

X            Helper   Hinderer
Probability  p        1 - p

3.1 Probability Model for S.R.S.

Suppose we take an S.R.S. of size n from a population. Let p be the unknown proportion of individuals in the population who share a certain characteristic. Let $\hat{p}$ be the proportion of individuals in our sample who share this characteristic. Then, so long as n is large, the Normal density curve assigns probabilities to the possible values of $\hat{p}$ upon repeated sampling. The mean of this density is p. The standard deviation is $\sqrt{p(1-p)/n}$.
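
The claim about the mean and standard deviation can be checked by simulating repeated sampling. The sketch below makes several assumptions that are not in the slides: the population is large enough that each sampled individual can be treated as an independent draw with success chance p, and the values p = 0.3, n = 200, and 2,000 repeated samples are hypothetical.

```python
import math
import random
import statistics

def simulate_p_hats(p, n, n_samples, seed=0):
    """Draw n_samples samples of size n and return the sample proportion
    from each. Each individual is modeled as an independent draw that has
    the characteristic with probability p (large-population assumption)."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(n_samples):
        count = sum(rng.random() < p for _ in range(n))
        p_hats.append(count / n)
    return p_hats

p, n = 0.3, 200  # hypothetical population proportion and sample size
p_hats = simulate_p_hats(p, n, n_samples=2_000)

simulated_mean = statistics.mean(p_hats)
simulated_sd = statistics.stdev(p_hats)
theoretical_sd = math.sqrt(p * (1 - p) / n)

print(f"mean of simulated p-hats: {simulated_mean:.4f} (p = {p})")
print(f"sd of simulated p-hats:   {simulated_sd:.4f}")
print(f"sqrt(p(1-p)/n):           {theoretical_sd:.4f}")
```

The simulated mean and standard deviation should land near p and the formula value, respectively.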

4.0 Some Probability Rules

A. Any probability is a number between 0 and 1. For any outcome A, the probability P(A) satisfies $0 \le P(A) \le 1$.
B. All possible outcomes together must have probability 1.
C. The probability that an outcome does not occur is 1 minus the probability that it does occur.
D. If two outcomes cannot occur together, the probability that one or the other occurs is the sum of their individual probabilities.

4.1 Example

Ex 18.8: Select a first-year college student at random and ask what their academic rank was in high school. Here is the probability model for the random variable X = academic rank of student.

Rank         Top 20%   Second 20%   Third 20%   Fourth 20%   Lowest 20%
Probability  [values missing]

1. Check that this is a valid probability model.
2. What is the probability that a randomly chosen first-year college student was not in the top 20%?
3. What is the probability that a randomly chosen first-year college student was in the top 40%?
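
The three questions map directly onto rules A through D. The sketch below shows the mechanics only: the probabilities in `model` are made-up placeholders, not the values from Ex 18.8, and the helper name `is_valid_model` is hypothetical.

```python
import math

def is_valid_model(model):
    """Rules A and B: every probability lies in [0, 1] and they sum to 1."""
    probs = list(model.values())
    return all(0 <= pr <= 1 for pr in probs) and math.isclose(sum(probs), 1.0)

# MADE-UP probabilities for illustration only (not the textbook's values).
model = {
    "Top 20%": 0.40,
    "Second 20%": 0.25,
    "Third 20%": 0.20,
    "Fourth 20%": 0.10,
    "Lowest 20%": 0.05,
}

valid = is_valid_model(model)                        # question 1
p_not_top = 1 - model["Top 20%"]                     # question 2, rule C
p_top_40 = model["Top 20%"] + model["Second 20%"]    # question 3, rule D

print(valid, round(p_not_top, 2), round(p_top_40, 2))
```

With the actual table values substituted for the placeholders, the same three lines would answer the exercise.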

4.2 More Probability Rules

Multiplication Rule: The chance that two things will both happen equals the chance that the first will happen, multiplied by the chance that the second will happen given that the first has happened.

Two things are independent if the chances for the second given the first are the same, no matter how the first turns out. Otherwise they are dependent.

Addition Rule: To find the chance that at least one of two things will happen, check to see if they are mutually exclusive. If they are, add the chances.

Two things are mutually exclusive when the occurrence of one prevents the occurrence of the other; one excludes the other.
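
A few small calculations may make the two rules concrete. The card, coin, and die scenarios below are illustrative examples chosen for this sketch; they do not come from the slides. Exact fractions are used so no rounding is involved.

```python
from fractions import Fraction

# Multiplication rule, dependent case: draw two cards without replacement.
# The chance the second card is an ace depends on whether the first was.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first_ace = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first_ace

# Multiplication rule, independent case: two tosses of a fair coin.
# The chance of heads on the second toss is 1/2 however the first lands.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

# Addition rule: one roll of a fair die. Rolling a 1 and rolling a 2 are
# mutually exclusive, so the chances add.
p_one_or_two = Fraction(1, 6) + Fraction(1, 6)

print(p_two_aces, p_two_heads, p_one_or_two)  # 1/221 1/4 1/3
```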

5.0 Lotsa Distributions (from Prof. June Morita)

[Diagram legend: one symbol = one sampling unit]
POPULATION: desired information. A parameter is a numerical fact about the population.
SAMPLE: approximation/estimate. A statistic estimates the parameter.

We want to estimate the proportion of individuals in a population with a certain characteristic; call it p. Draw an S.R.S. of n individuals from the population. Compute the sample proportion $\hat{p}$. Use it to estimate p.

5.1 Population Distribution

Definition: The population distribution of a variable is the distribution of the values of the variable among all individuals in the population.

Example: Elway Research conducted a poll of 408 randomly selected Washingtonians and found that 64% (or 261) of their sample said they would support a temporary sales tax hike. Write the population distribution for the variable "Would you support a sales tax hike?" (yes/no). (List all possible values of the variable and the percentage of the time each occurs in the population.)

5.2 Distribution of Sample Values

Definition: The distribution of the values in the sample shows all values of the variable seen in our sample and how often those values occurred.

Example: Elway Research conducted a poll of 408 randomly selected Washingtonians and found that 64% (or 261) of their sample said they would support a temporary sales tax hike. Write the distribution of the sample values for the variable "Would you support a sales tax hike?" (yes/no).

5.3 Sampling Distribution of Statistic

Definition: The sampling distribution of a statistic shows the distribution of values of the statistic in all possible samples of size n drawn from the population.

Example: Elway Research conducted a poll of 408 randomly selected Washingtonians and found that 64% (or 261) of their sample said they would support a temporary sales tax hike. Write the sampling distribution of their statistic $\hat{p}$.

For large sample sizes, $\hat{p}$ tends to be close to the parameter p (law of large numbers). And as the sample size increases, the sampling distribution of $\hat{p}$ looks more and more like a normal distribution. The mean of this density is p. The standard deviation is $\sqrt{p(1-p)/n}$ (central limit theorem).
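
Because p is unknown in the Elway example, the standard deviation formula cannot be evaluated exactly. One common workaround, assumed here and not stated on the slide, is to plug the observed sample proportion in place of p to get a rough estimate.

```python
import math

n = 408          # sample size reported for the Elway poll
p_hat = 261 / n  # observed sample proportion, about 64%

# ASSUMPTION: substitute p_hat for the unknown p in sqrt(p(1-p)/n).
estimated_sd = math.sqrt(p_hat * (1 - p_hat) / n)

print(f"p-hat = {p_hat:.3f}, estimated sd of p-hat = {estimated_sd:.4f}")
```

Under this plug-in assumption the sampling distribution would be roughly normal, centered at the unknown p, with a standard deviation of a little over two percentage points.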

5.4 Compare and Contrast

The population distribution describes the individuals that make up the population.
versus
The distribution of the values in the sample describes the individuals that make up the sample.
versus
The sampling distribution describes how a statistic varies from one sample to another, all drawn from the same population.

5.4 Example

You ask an S.R.S. of 1,500 college students whether they applied for admission to any other college. Suppose 500 (33%) of the students in your sample admitted they had.

1. Population? Sample? Sample size?
2. Variable?
3. Population distribution of variable?
4. Distribution of variable in sample?
5. What is the statistic (in words)? What is its value for this one sample?
6. What is the sampling distribution of this statistic?

6.0 Sampling Distribution As a Probability Model

Sampling distributions are probability models. What does this mean exactly?

Imagine having a box full of tickets. Each ticket has a value of $\hat{p}$ on it. How many tickets there are of each $\hat{p}$ value depends on the sampling distribution. We mix up these tickets well and draw one at random. Then the probability of picking a ticket with a specific $\hat{p}$ value is determined by the area given to that value of $\hat{p}$ under the sampling distribution.

Check that the sampling distribution for $\hat{p}$ from example 5.4 is a valid probability model.

6.1 Example

You ask an S.R.S. of 1,500 college students whether they applied for admission to any other college. Suppose, in fact, 35% of all college students applied to colleges besides the ones they are attending.

1. What is the sampling distribution for $\hat{p}$?
2. What is the probability that if you select at random from this distribution, you will get a $\hat{p}$ less than 0.33?
3. What is the probability that if you select at random from this distribution, you will get a $\hat{p}$ less than 0.33 or greater than 0.35?
4. Give the range of the middle 95% of the $\hat{p}$ values for this distribution. Interpret this range.
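
One way to work through these four questions is with the normal approximation from section 3.1, using p = 0.35 and n = 1,500. The sketch below relies on `NormalDist` from Python's standard library and assumes the normal approximation is adequate at this sample size; the variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.35, 1500

# Question 1: approximately normal with mean p and sd sqrt(p(1-p)/n).
sd = math.sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(mu=p, sigma=sd)

# Question 2: chance of drawing a p-hat below 0.33.
p_below_033 = sampling_dist.cdf(0.33)

# Question 3: the two events are mutually exclusive, so their chances add.
p_above_035 = 1 - sampling_dist.cdf(0.35)
p_outside = p_below_033 + p_above_035

# Question 4: the middle 95% runs from the 2.5th to the 97.5th percentile.
low = sampling_dist.inv_cdf(0.025)
high = sampling_dist.inv_cdf(0.975)

print(f"sd = {sd:.4f}")
print(f"P(p-hat < 0.33) = {p_below_033:.3f}")
print(f"P(p-hat < 0.33 or p-hat > 0.35) = {p_outside:.3f}")
print(f"middle 95%: {low:.3f} to {high:.3f}")
```

Since 0.35 is the mean of the distribution, the chance of landing above it is one half, which is why the answer to question 3 is the answer to question 2 plus 0.5.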

7.0 How Probability Works

The probability of a head in tossing a fair coin is 0.5. This means that as we make more tosses, the proportion of heads will eventually get close to 0.5. This does not mean that the count of heads will get close to half the number of tosses. Why?

[Table: number of tosses, number of heads, proportion of heads; values missing]
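
One way to see why is to compare the typical size of chance error in the count with that in the proportion. The sketch below starts from the standard deviation of the proportion given earlier, $\sqrt{p(1-p)/n}$, and multiplies by n to get the standard deviation of the count, $\sqrt{np(1-p)}$. The function name and the chosen numbers of tosses are illustrative.

```python
import math

def count_and_proportion_sd(n_tosses, p=0.5):
    """Return (sd of the number of heads, sd of the proportion of heads)
    for n_tosses independent tosses with heads probability p."""
    sd_proportion = math.sqrt(p * (1 - p) / n_tosses)
    sd_count = n_tosses * sd_proportion  # equals sqrt(n * p * (1 - p))
    return sd_count, sd_proportion

rows = [(n, *count_and_proportion_sd(n)) for n in (100, 10_000, 1_000_000)]
for n, sd_count, sd_prop in rows:
    print(f"{n:>9,} tosses: count off by about {sd_count:,.0f} heads, "
          f"proportion off by about {sd_prop:.4f}")
```

The typical gap between the count of heads and half the tosses grows with the number of tosses, while the typical gap between the proportion and 0.5 shrinks.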