ROBUST ESTIMATION OF STANDARD ERRORS

------------------------------------------------------------------------------
       log:  Z:\LDA\DataLDA\sitka_Lab8.log
  log type:  text
 opened on:  18 Feb 2004, 11:29:17

. ****The observed mean responses in each of the 4 chambers; for 1988 and 1989

. egen mean1=mean() if ch==1&<300, by()
(892 missing values generated)

. egen mean2=mean() if ch==2&<300, by()
(892 missing values generated)

. egen mean3=mean() if ch==3&<300, by()
(967 missing values generated)

. egen mean4=mean() if ch==4&<300, by()
(962 missing values generated)

. egen mean5=mean() if ch==1&>300, by()
(811 missing values generated)

. egen mean6=mean() if ch==2&>300, by()
(811 missing values generated)

. egen mean7=mean() if ch==3&>300, by()
(931 missing values generated)

. egen mean8=mean() if ch==4&>300, by()
(923 missing values generated)

. sort

. graph7 mean*, c(llllllll) xlab ylab s(iiiiiiii)

[Graph: observed mean response over time by chamber; legend mean1 mean2 mean3 mean4; y axis 4 to 7, x axis 200 to 800]

. ****The observed mean responses in each of the two treatment groups

. egen meanc1=mean() if o3==0&<300, by()
(902 missing values generated)

. egen meanc2=mean() if o3==0&>300, by()
(827 missing values generated)

. egen meant1=mean() if o3==1&<300, by()
(757 missing values generated)

. egen meant2=mean() if o3==1&>300, by()
(595 missing values generated)

. label var meanc1 "control-1988"

. label var meanc2 "control-1989"

. label var meant1 "O3-treated-1988"

. label var meant2 "O3-treated-1989"

. sort

. graph7 meanc* meant*, c(llll) xlab ylab s(iiii)

[Graph: observed mean response over time by treatment group; legend control-1988, control-1989, O3-treated-1988, O3-treated-1989; y axis 4 to 7, x axis 200 to 800]

. *** Compute the REML estimates of the covariance matrix

. tsset id
       panel variable:  id, 1 to 79
        time variable:  , 152 to 674, but with gaps

. ** Plotting ize over time, across chambers ;

. ** two scatterplots;

. graph7 if o3==0, ti("scatterplot of ize vs., in normal environ")

. graph7 if o3==1, ti("scatterplot of ize vs., in ozone environ")

[Graph: "scatterplot of ize vs., in normal environ"; x axis 152 to 674]

[Graph: "scatterplot of ize vs., in ozone environ"; x axis 152 to 674]

. ** two box plots;

. sort

. graph7 if o3==0, box by() ti("boxplot of ize vs time, in normal environ")

. graph7 if o3==1, box by() ti("boxplot of ize vs time, in normal environ")

[Graphs: two box plots, both titled "Boxplot of ize vs time, in normal environ"; boxes at times 152 174 201 227 258 469 496 528 556 579 613 639 674]

. ** Mean trend plot***;

. xtgraph if (<300), group(o3) ti("mean ize vs, 1988") bar(se) offset(1.5)

. xtgraph if (>300), group(o3) ti("mean ize vs, 1989") bar(se) offset(1.5)

[Graphs: "Mean ize vs, 1988" (x axis 151.25 to 258.75) and "Mean ize vs, 1989" (x axis 468.25 to 674.75); group means with standard-error bars for o3 = 0 and o3 = 1]

. ****Saturated model for the mean response, ignore chamber effect.

. keep if <300
(632 observations deleted)

. anova o3

                   Number of obs =     395     R-squared     =  0.3871
                   Root MSE      = .628909     Adj R-squared =  0.3792

      Source |  Partial SS    df       MS           F     Prob > F
  -----------+----------------------------------------------------
       Model |  97.1719809     5  19.4343962      49.14     0.0000
             |
             |  93.3623074     4  23.3405769      59.01     0.0000
          o3 |  3.80967344     1  3.80967344       9.63     0.0021
             |
    Residual |  153.859874   389  .395526667
  -----------+----------------------------------------------------
       Total |  251.031854   394  .637136687

. predict res, resid

. ***The corresponding fitted means

. predict yhat
(option xb assumed; fitted values)

. keep id res

. reshape group 152 174 201 227 258

. reshape var res

. reshape cons id

. reshape wide

. matrix accum Cov=res152 res174 res201 res227 res258, deviations noconstant
(obs=79)

. scalar adj=1/(_result(1)-4)

. matrix Cov=adj*Cov

. *REML estimate of covariance for 1988, P72

. matrix list Cov

symmetric Cov[5,5]

            res152      res174      res201      res227      res258
res152    .4500164
res174   .40754288   .39916499
res201   .37601123   .37498865   .37307495
res227   .37273896   .37764627   .37626807   .40840599
res258   .37131931   .37765755   .37499285    .4087105   .42080266

. **Now get the GLS beta(hatw)

. use "Z:\LDA\DataLDA\sitka.dta", clear

. keep if <300
(632 observations deleted)

. tab, gen(t)

            |      Freq.     Percent        Cum.
------------+-----------------------------------
        152 |         79       20.00       20.00
        174 |         79       20.00       40.00
        201 |         79       20.00       60.00
        227 |         79       20.00       80.00
        258 |         79       20.00      100.00
------------+-----------------------------------
      Total |        395      100.00

. gen eta=1-o3

. *covariate of time

. gen gamma=/100*eta

.
. *We fit GEE marginal model with robust estimates of standard error

. xtgee t1 t2 t3 t4 t5 eta gamma, noconst i(id) corr(exc) robust

Iteration 1: tolerance = 6.450e-13

GEE population-averaged model                   Number of obs      =       395
Group variable:                         id      Number of groups   =        79
Link:                             identity      Obs per group: min =         5
Family:                           Gaussian                     avg =       5.0
Correlation:                  exchangeable                     max =         5
                                                Wald chi2(6)       =   6651.46
Scale parameter:                  .3881248      Prob > chi2        =    0.0000

                             (standard errors adjusted for clustering on id)
------------------------------------------------------------------------------
             |             Semi-robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          t1 |   4.060577   .0790991    51.34   0.000     3.905546    4.215609
          t2 |   4.470879   .0776508    57.58   0.000     4.318686    4.623071
          t3 |   4.842733   .0773668    62.59   0.000     4.691097    4.994369
          t4 |   5.178935   .0820593    63.11   0.000     5.018102    5.339769
          t5 |    5.31669   .0838617    63.40   0.000     5.152325    5.481056
         eta |  -.2216775   .2429765    -0.91   0.362    -.6979027    .2545478
       gamma |    .213851   .0789394     2.71   0.007     .0591327    .3685694
------------------------------------------------------------------------------

. *Incorrect standard error using OLS

. reg t1 t2 t3 t4 t5 eta gamma, noconst

      Source |       SS       df       MS              Number of obs =     395
-------------+------------------------------           F(  7,   388) = 3381.85
       Model |  9353.83562     7  1336.26223           Prob > F      =  0.0000
    Residual |  153.309291   388  .395127038           R-squared     =  0.9839
-------------+------------------------------           Adj R-squared =  0.9836
       Total |  9507.14491   395  24.0687213           Root MSE      =  .62859

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          t1 |   4.060577     .07937    51.16   0.000     3.904528    4.216626
          t2 |   4.470879   .0756955    59.06   0.000     4.322054    4.619703
          t3 |   4.842733   .0739281    65.51   0.000     4.697383    4.988083
          t4 |   5.178935    .075257    68.82   0.000     5.030973    5.326898
          t5 |    5.31669   .0805032    66.04   0.000     5.158413    5.474967
         eta |  -.2216775   .3729256    -0.59   0.553    -.9548853    .5115304
       gamma |    .213851   .1811625     1.18   0.239    -.1423321    .5700341
------------------------------------------------------------------------------

. *Now create design matrix for 1988 data X

. set matsize 400

. *Now create design matrix for 1988 data X

. mkmat t1 t2 t3 t4 t5 eta gamma, matrix(X)

. *Now create response matrix of Y for 1988 data

. mkmat, matrix(Y)

. *OLS estimate of beta (4.3.4) pg. 60

. matrix beta=syminv(X'*X)*X'*Y

. matrix list beta

beta[7,1]

   t1    4.0605772
   t2    4.4708787
   t3    4.8427332
   t4    5.1789353
   t5    5.3166904
  eta   -.22167746
gamma    .21385103

. *Calculate the robust estimate of standard error of beta (4.6.1)

. matrix diag=I(79)

. matrix V=diag#Cov

. ***Get the estimated variance matrix (4.6.2) pg. 70

. matrix RW=syminv(X'*X)*X'*V*X*syminv(X'*X)

. *Robust estimate of covariance matrix for the coefficients

. matrix list RW

symmetric RW[7,7]
               t1          t2          t3          t4          t5         eta       gamma
   t1   .00815331
   t2   .00753765   .00738623
   t3   .00704276   .00702453   .00699381
   t4   .00690913   .00700455   .00702797   .00747413
   t5   .00678122   .00694076   .00700437   .00752491   .00778973
  eta  -.00946726  -.00850778  -.00733024  -.00619631  -.00484431    .0508646
gamma   .00112069   .00065165   .00007602  -.00047829  -.00113919  -.01378161     .006737
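The expression used for RW is the sandwich form (X'X)^-1 X' V X (X'X)^-1, with V block diagonal: one copy of the within-subject covariance per subject. As a minimal sketch of the same algebra in Mata, run from a do-file, the lines below use made-up numbers (3 subjects, 2 occasions). The values of n, X and S are hypothetical and unrelated to the Sitka data.

```stata
* Toy sandwich variance in Mata; all inputs are hypothetical
mata:
n  = 3                          // number of subjects (assumed)
X  = J(n, 1, 1) # I(2)          // 6 x 2 design: one indicator column per occasion
S  = (0.40, 0.30 \ 0.30, 0.50)  // assumed 2 x 2 within-subject covariance
V  = I(n) # S                   // block-diagonal covariance of the stacked response
B  = invsym(cross(X, X))        // (X'X)^-1
RW = B * X' * V * X * B         // sandwich estimate of Var(beta-hat)
se = sqrt(diagonal(RW))'        // standard errors as a row vector
RW
se
end
```

With this balanced indicator design, RW reduces to S/n, the covariance of the occasion means across the three subjects.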

. * Calculate the robust standard error

. matrix var=vecdiag(RW)

. matrix var=var'

. svmat var, name(varbeta)

. keep varbeta1

. drop in 8/395
(388 observations deleted)

. gen se=sqrt(varbeta1)

. svmat beta, name(beta)

. mkmat beta1 se, matrix(Coef)

. matrix Coef=Coef'

. matrix list Coef

Coef[2,7]
               r1          r2          r3          r4          r5          r6          r7
beta1   4.0605774   4.4708786   4.8427334   5.1789355   5.3166904  -.22167745   .21385103
   se    .0902957   .08594321   .08362897   .08645306   .08825944    .2255318   .08207925

. matrix colnames Coef= beta1 beta2 beta3 beta4 beta5 tao gamma

. *Table4.3 for 1988 P75

. matrix list Coef

Coef[2,7]
            beta1       beta2       beta3       beta4       beta5         tao       gamma
beta1   4.0605774   4.4708786   4.8427334   5.1789355   5.3166904  -.22167745   .21385103
   se    .0902957   .08594321   .08362897   .08645306   .08825944    .2255318   .08207925

. log close
       log:  Z:\LDA\DataLDA\sitka_Lab8.log
  log type:  text
 closed on:  18 Feb 2004, 12:14:51
------------------------------------------------------------------------------
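The svmat, keep and drop steps in this log overwrite the data in memory to obtain the square roots of the diagonal. One possible alternative stays within Stata's matrix commands, because the Cholesky factor of a diagonal matrix holds the square roots of its diagonal entries. The sketch below uses a hypothetical 2 x 2 matrix named Rtoy, not the RW from the log.

```stata
* Toy example: standard errors from a variance matrix without touching the data
matrix Rtoy = (0.04, 0.01 \ 0.01, 0.09)   // hypothetical variance matrix
matrix D    = diag(vecdiag(Rtoy))         // keep only the variances on the diagonal
matrix SE   = vecdiag(cholesky(D))        // square roots of the diagonal entries
matrix list SE
```

Here SE should list .2 and .3, the square roots of .04 and .09. Applied to the real RW, the same three matrix lines would leave the dataset untouched.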