ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics May 2011

Instructions: Answer all five (5) questions. Point totals for each question are given in parentheses. The parts within each question receive equal weight. You may use a calculator. Be sure to show enough work so that we can see how you got your answer.

1. (15 points) For a random variable $X$, which is measured as a proportion and therefore takes on values in the interval $[0,1]$, consider a class of cumulative distribution functions defined for a parameter $\theta > 0$:

$F(x;\theta) = 0, \quad x \le 0$
$F(x;\theta) = x^{\theta}, \quad 0 < x < 1$
$F(x;\theta) = 1, \quad x \ge 1$

(i) Verify that $F$ is in fact a valid CDF. Is it continuous? (A quick sketch of the function might help.)
(ii) Derive a PDF for $F$ for $0 < x < 1$; the PDF can be set to zero for $x \notin (0,1)$.
(iii) Show that $E(X)$ can be written as $E(X) = \theta/(1+\theta)$. What happens to $E(X)$ as $\theta$ increases?
(iv) Given access to a random sample $\{X_i : i = 1,\dots,n\}$, use part (iii) to propose a consistent method-of-moments estimator of $\theta$. How come the estimator is not unbiased?
(v) Obtain the maximum likelihood estimator of $\theta$. Does it differ from the method of moments estimator?
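As a numerical sanity check on question 1, the sketch below simulates from the distribution and compares the two estimators. It assumes the CDF is $F(x;\theta) = x^{\theta}$ on $(0,1)$ (the exponent is not legible in the transcription, so this is an assumption), which gives $E(X) = \theta/(1+\theta)$, a method-of-moments estimator $\bar{X}/(1-\bar{X})$, and a maximum likelihood estimator $-n/\sum_i \log X_i$. The function names, the seed, and the value $\theta = 2$ are illustrative choices, not part of the exam.

```python
import math
import random


def draw_sample(theta, n, rng):
    # Inverse-CDF sampling: if U ~ Uniform(0,1), then U**(1/theta) has CDF x**theta.
    # 1 - rng.random() lies in (0, 1], so log() below is always defined.
    return [(1.0 - rng.random()) ** (1.0 / theta) for _ in range(n)]


def mom_estimate(xs):
    # Solve xbar = theta / (1 + theta) for theta.
    xbar = sum(xs) / len(xs)
    return xbar / (1.0 - xbar)


def mle_estimate(xs):
    # Log-likelihood: n*log(theta) + (theta - 1) * sum(log x_i);
    # setting its derivative to zero gives theta = -n / sum(log x_i).
    return -len(xs) / sum(math.log(x) for x in xs)


rng = random.Random(2011)   # fixed seed so the run is reproducible
theta = 2.0                 # assumed true parameter for the illustration
xs = draw_sample(theta, 200_000, rng)

sample_mean = sum(xs) / len(xs)
theta_mom = mom_estimate(xs)
theta_mle = mle_estimate(xs)

print("sample mean:", sample_mean, "theory:", theta / (1.0 + theta))
print("method of moments:", theta_mom)
print("maximum likelihood:", theta_mle)
```

Both estimates should land close to the true value in a large sample, yet they are different functions of the data. The method-of-moments estimator is a nonlinear transformation of $\bar{X}$, which is the source of its finite-sample bias.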

2. (20 points) Let $\{X_i : i = 1,2,\dots,n\}$ be a random sample from a normal population with mean $\mu$ and variance $\mu - 10$. Note that we must have $\mu > 10$ for this setup to make sense; in other words, the parameter space for $\mu$ is the set of values $(10,\infty)$. Let $\bar{X}$ denote the sample average and $S^2 = (n-1)^{-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ be the sample variance. Consider two estimators of $\mu$: $W_1 = \bar{X}$ and $W_2 = S^2 + 10$.

(i) Explain why $W_2$ is an unbiased estimator of $\mu$. Does $W_2$ always produce a value in the parameter space?
(ii) Find $\mathrm{Var}(W_1)$ in terms of $\mu$ and the sample size. Does $W_1$ always produce a value in the parameter space?
(iii) Show that $\mathrm{Var}(W_2) = 2(\mu - 10)^2/(n-1)$. [Hint: In general, we know from the chi-square distribution that $\mathrm{Var}[(n-1)S^2/(\mu - 10)] = 2(n-1)$.]
(iv) Show that there are values of $\mu$ in the parameter space such that when $n \ge 2$, $\mathrm{Var}(W_2) < \mathrm{Var}(W_1)$. What do you conclude about whether $\bar{X}$ is a best unbiased estimator of $\mu$ in this setting?
(v) How come the finding from part (iv) does not invalidate the general result that the sample mean, obtained from a random sample from a population with finite second moment, is the best linear unbiased estimator (BLUE) of $\mu$?
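The variance comparison in question 2 can be checked numerically. The sketch below assumes the setup is a normal population with mean $\mu$ and variance $\mu - 10$, with $W_1 = \bar{X}$ and $W_2 = S^2 + 10$ (the transcription drops several symbols, so these are reconstructions). Under that assumption, $\mathrm{Var}(W_1) = (\mu-10)/n$ and $\mathrm{Var}(W_2) = 2(\mu-10)^2/(n-1)$, so $W_2$ has the smaller variance exactly when $\mu - 10 < (n-1)/(2n)$. The helper names, the seed, and the values $\mu = 10.2$, $n = 5$ are illustrative.

```python
import random


def var_w1(mu, n):
    # Variance of the sample mean when the population variance is mu - 10.
    return (mu - 10.0) / n


def var_w2(mu, n):
    # Variance of S^2 + 10 for a normal sample: 2 * sigma^4 / (n - 1).
    return 2.0 * (mu - 10.0) ** 2 / (n - 1)


def crossover_mu(n):
    # Var(W2) < Var(W1) exactly when mu is below this value.
    return 10.0 + (n - 1) / (2.0 * n)


def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def simulate(mu, n, reps, seed=2011):
    # Monte Carlo estimates of Var(W1) and Var(W2).
    rng = random.Random(seed)
    sigma = (mu - 10.0) ** 0.5
    w1, w2 = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
        w1.append(xbar)
        w2.append(s2 + 10.0)
    return variance(w1), variance(w2)


mu, n = 10.2, 5
sim_v1, sim_v2 = simulate(mu, n, reps=20_000)
print("Var(W1): formula", var_w1(mu, n), "simulated", sim_v1)
print("Var(W2): formula", var_w2(mu, n), "simulated", sim_v2)
print("W2 beats W1 for mu below", crossover_mu(n))
```

For $\mu$ close to 10 the nonlinear estimator $W_2$ wins, while for larger $\mu$ the ranking reverses. That reversal is why part (v) distinguishes "best unbiased" from "best linear unbiased".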

3. (18 points) Provide an answer to each of the following questions, being sure to provide brief justification.

(i) Two people each flip a fair coin 3 times. What is the probability that at least one of them gets three heads?
(ii) Discuss the concepts of the population standard deviation, the sampling standard deviation, and the standard error when estimating the mean from a population with a finite second moment.
(iii) The following equation with an interaction term was estimated using data on several hundred students in a principles of microeconomics course. The dependent variable is a standardized final exam score, attend is number of lectures attended (out of 30), and gpa is grade point average at the beginning of the term (measured on a four-point scale).

score = … attend … .50 gpa … .05 attend·gpa
n = 700, R² = .15

For a student with gpa = 3, what is the estimated marginal effect of one more lecture attended?
(iv) Answer agree or disagree with the following statement, and provide justification: If the errors in a multiple regression model suffer from heteroskedasticity, the OLS residuals will not sum to zero.
(v) Answer agree or disagree with the following statement, and provide justification: Serial correlation in a time series regression needs to be dealt with only if we want more efficient estimation.
(vi) Discuss the usefulness of the mean squared error for evaluating estimators.
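Part (i) of question 3 lends itself to a short exact calculation. The sketch below assumes the two people's flips are independent and computes the probability two ways: with the complement rule, and by enumerating all $2^6$ equally likely sequences. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Probability that one person gets three heads in three fair flips.
p_three_heads = Fraction(1, 2) ** 3

# Complement rule: 1 - P(neither person gets three heads).
p_at_least_one = 1 - (1 - p_three_heads) ** 2

# Brute-force check: first three flips belong to person A, last three to person B.
outcomes = list(product("HT", repeat=6))
all_heads = ("H", "H", "H")
hits = sum(1 for o in outcomes if o[:3] == all_heads or o[3:] == all_heads)
p_enumerated = Fraction(hits, len(outcomes))

print(p_at_least_one, p_enumerated)  # 15/64 15/64
```

The enumeration counts 8 sequences where the first person gets three heads, 8 where the second does, and subtracts the 1 sequence counted twice, which matches the complement-rule answer.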

4. (18 points) The Stata output below should be used to answer the following questions. The sample is of fast food restaurants in the Northeast, combined with economic and demographic information for the surrounding community.

(i) In the simple regression of the price of soda on the proportion black, what effect does the proportion black have on the price of soda? In particular, if prpblck increases by .20, what is the estimated effect on psoda? Is it statistically significant?
(ii) When the variables prppov, income, and hseval are added as controls to the OLS regression in part (i), what happens to the coefficient on prpblck? What happens to its statistical significance?
(iii) Suppose someone comments that, because income and hseval are so highly correlated, they have no business being in the same equation. How would you respond?
(iv) Based on the Stata output below, can you conclude that a model that assumes the elasticity of soda price with respect to income is constant is better than the model where psoda and income appear in level form? Explain.
(v) In the model in which log(psoda) is the dependent variable, how do you interpret the coefficient on prpblck? For concreteness, consider a .20 increase in prpblck.
(vi) Even after controlling for income, poverty rates, and housing values, why might the price of fast food be higher in communities with higher proportions of blacks in the absence of any racial discrimination?

. des psoda prpblck prppov income hseval lpsoda lincome lhseval

              storage  display    value
variable name   type   format     label     variable label
psoda           float  %9.0g                price of medium soda, dollars
prpblck         float  %9.0g                proportion black, zipcode
prppov          float  %9.0g                proportion in poverty, zipcode
income          float  %9.0g                median family income, zipcode
hseval          float  %9.0g                median housing value, zipcode
lpsoda          float  %9.0g                log(psoda)
lincome         float  %9.0g                log(income)
lhseval         float  %9.0g                log(hseval)

. sum psoda prpblck prppov income hseval

    Variable |   Obs   Mean   Std. Dev.   Min   Max
       psoda |
     prpblck |
      prppov |
      income |
      hseval |

. corr income hseval
(obs=409)

             |   income   hseval
      income |
      hseval |

. reg psoda prpblck

      Source |   SS   df   MS          Number of obs =
       Model |                         F(  1,   399) =    7.34
    Residual |                         Prob > F      =
       Total |                         R-squared     =
                                       Adj R-squared =
                                       Root MSE      =   .0881

       psoda |   Coef.   Std. Err.   t   P>|t|   [95% Conf. Interval]
     prpblck |
       _cons |

. reg psoda prpblck prppov income hseval

      Source |   SS   df   MS          Number of obs =
       Model |                         F(  4,   396) =
    Residual |                         Prob > F      =
       Total |                         R-squared     =
                                       Adj R-squared =
                                       Root MSE      =    .081

       psoda |   Coef.   Std. Err.   t   P>|t|   [95% Conf. Interval]
     prpblck |
      prppov |
      income |   -2.44e…   …e…                   …e…   …e-07
      hseval |    1.02e…   …e…                   …e…   …e-06
       _cons |

. reg lpsoda prpblck prppov lincome lhseval

      Source |   SS   df   MS          Number of obs =
       Model |                         F(  4,   396) =
    Residual |                         Prob > F      =
       Total |                         R-squared     =
                                       Adj R-squared =
                                       Root MSE      =

      lpsoda |   Coef.   Std. Err.   t   P>|t|   [95% Conf. Interval]
     prpblck |
      prppov |
     lincome |
     lhseval |
       _cons |
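For parts (i) and (v) of question 4, the arithmetic behind interpreting a .20 increase in prpblck can be sketched as follows. The coefficient value used here is hypothetical, chosen only for illustration; the actual estimates are not legible in the transcribed Stata output. In a level model the change in psoda is the coefficient times the change in prpblck, while in a log-level model the approximate percentage change is 100 times that product and the exact percentage change is $100[\exp(\beta\,\Delta) - 1]$.

```python
import math


def level_effect(coef, delta):
    # Level model: change in the dependent variable, in its own units (dollars for psoda).
    return coef * delta


def log_level_pct_approx(coef, delta):
    # Log-level model: approximate percentage change in the dependent variable.
    return 100.0 * coef * delta


def log_level_pct_exact(coef, delta):
    # Log-level model: exact percentage change implied by the log specification.
    return 100.0 * (math.exp(coef * delta) - 1.0)


hypothetical_coef = 0.10   # NOT taken from the Stata output; illustration only
delta = 0.20               # the increase in prpblck named in the question

approx_pct = log_level_pct_approx(hypothetical_coef, delta)
exact_pct = log_level_pct_exact(hypothetical_coef, delta)
print("approximate % change:", approx_pct)
print("exact % change:", exact_pct)
```

For small values of the coefficient times the change, the approximate and exact figures are nearly identical, which is why the simple "100 times the coefficient" reading is usually adequate.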

5. (20 points) Let $y$ be an $n \times 1$ vector of dependent variables, with $t$-th entry $y_t$, and let $X$ be an $n \times k$ matrix of regressors. Consider the standard linear model written in matrix form:

$y = X\beta + u$,

where $\beta$ is $k \times 1$ and $u$ is the $n \times 1$ vector of errors. You may treat $X$ as nonrandom with rank $k$, and assume $E(u) = 0$. (The alternative would be to allow $X$ to be random with $E(u \mid X) = 0$.)

(i) Let $Z_1$ and $Z_2$ also be $n \times k$ nonrandom matrices such that $Z_1'X$ is nonsingular. Define an estimator $\tilde{\beta} = (Z_1'X)^{-1}Z_2'y$. Show that $\tilde{\beta}$ is generally biased and compute its bias.
(ii) Let $\hat{\beta}$ be the usual OLS estimator and define a third estimator, $\check{\beta} = \tilde{\beta} + [I_k - (Z_1'X)^{-1}Z_2'X]\hat{\beta}$. Show that this estimator is unbiased.
(iii) Suppose the variance-covariance matrix of $u$ is spherical, that is, $\mathrm{Var}(u) = \sigma^2 I_n$. Is it possible that $\mathrm{Var}(\check{\beta})$ is smaller than $\mathrm{Var}(\hat{\beta})$ (in the matrix sense)? Explain.
(iv) Let $\hat{Z}_1$ be the $n \times k$ matrix of fitted values from regressing $Z_1$ on $X$, so that $\hat{Z}_1 = X(X'X)^{-1}X'Z_1$. Use the bias expression for $\tilde{\beta}$ to show that $\tilde{\beta}$ is unbiased if $Z_2 = \hat{Z}_1$.
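The bias algebra in question 5 can be checked in the scalar case $k = 1$, where $X$, $Z_1$, and $Z_2$ are $n$-vectors. The sketch below assumes the estimator in part (i) is $(Z_1'X)^{-1}Z_2'y$ and that the third estimator in part (ii) is that estimator plus $[I_k - (Z_1'X)^{-1}Z_2'X]$ times OLS (the transcription drops the estimator symbols, so this form is a reconstruction). Because each estimator is linear in $y$ and $E(u) = 0$, its expectation equals the estimator evaluated at $E(y) = X\beta$, so no simulation is needed. All numbers and function names are illustrative.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# Illustrative nonrandom data for the scalar case k = 1 (not from the exam).
x = [1.0, 2.0, 3.0, 4.0]
z1 = [1.0, 0.0, 2.0, 1.0]
z2 = [0.5, 1.0, 1.0, 3.0]
beta = 1.5


def ols(y):
    # Usual OLS estimator: (x'x)^{-1} x'y.
    return dot(x, y) / dot(x, x)


def first_estimator(y, z2_vec):
    # Part (i): (z1'x)^{-1} z2'y.
    return dot(z2_vec, y) / dot(z1, x)


def third_estimator(y, z2_vec):
    # Part (ii), as assumed: first estimator + [1 - (z1'x)^{-1} z2'x] * OLS.
    correction = 1.0 - dot(z2_vec, x) / dot(z1, x)
    return first_estimator(y, z2_vec) + correction * ols(y)


# E(y) = x * beta because E(u) = 0; linear estimators pass expectations through.
y_mean = [beta * xi for xi in x]

bias_first = first_estimator(y_mean, z2) - beta
bias_third = third_estimator(y_mean, z2) - beta

# Part (iv): fitted values from regressing z1 on x, used in place of z2.
slope = dot(x, z1) / dot(x, x)
z1_hat = [slope * xi for xi in x]
bias_with_fitted = first_estimator(y_mean, z1_hat) - beta

print("bias of first estimator:", bias_first)
print("bias of third estimator:", bias_third)
print("bias when z2 is the fitted z1:", bias_with_fitted)
```

The first bias matches the formula $[(Z_1'X)^{-1}Z_2'X - I_k]\beta$, which vanishes when $Z_2'X = Z_1'X$. That equality is what replacing $Z_2$ with the fitted values $\hat{Z}_1$ delivers, since $\hat{Z}_1'X = Z_1'X$.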