This is a quick-and-dirty example for some syntax and output from pscore and psmatch2.

Similar documents
Guideline on evaluating the impact of policies -Quantitative approach-

********************************************************************************************** *******************************

Application: Effects of Job Training Program (Data are the Dehejia and Wahba (1999) version of Lalonde (1986).)

* STATA.OUTPUT -- Chapter 5

SOCY7706: Longitudinal Data Analysis Instructor: Natasha Sarkisian Two Wave Panel Data Analysis

(LDA lecture 4/15/08: Transition model for binary data. -- TL)

Topics in Biostatistics Categorical Data Analysis and Logistic Regression, part 2. B. Rosner, 5/09/17

Chapter 2 Part 1B. Measures of Location. September 4, 2008

All analysis examples presented can be done in Stata 10.1 and are included in this chapter s output.

Bios 312 Midterm: Appendix of Results March 1, Race of mother: Coded as 0==black, 1==Asian, 2==White. . table race white

for var trstprl trstlgl trstplc trstplt trstep: reg X trust10 stfeco yrbrn hinctnt edulvl pltcare polint wrkprty

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

Categorical Data Analysis

Unit 5 Logistic Regression Homework #7 Practice Problems. SOLUTIONS Stata version

Never Smokers Exposure Case Control Yes No

Table. XTMIXED Procedure in STATA with Output Systolic Blood Pressure, use "k:mydirectory,

Multilevel/ Mixed Effects Models: A Brief Overview

The study obtains the following results: Homework #2 Basics of Logistic Regression Page 1. . version 13.1

Sociology 7704: Regression Models for Categorical Data Instructor: Natasha Sarkisian. Preliminary Data Screening

Biostatistics 208 Data Exploration

İnsan Tunalı November 29, 2018 Econ 511: Econometrics I. ANSWERS TO ASSIGNMENT 10: Part II STATA Supplement

COMPARING MODEL ESTIMATES: THE LINEAR PROBABILITY MODEL AND LOGISTIC REGRESSION

Example Analysis with STATA

Example Analysis with STATA

PubHlth Introduction to Biostatistics. 1. Summarizing Data Illustration: STATA version 10 or 11. A Visit to Yellowstone National Park, USA

Group Comparisons: Using What If Scenarios to Decompose Differences Across Groups

ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics May 2011

Week 10: Heteroskedasticity

EFA in a CFA Framework

Dealing with missing data in practice: Methods, applications, and implications for HIV cohort studies

Trunkierte Regression: simulierte Daten

Nested or Hierarchical Structure School 1 School 2 School 3 School 4 Neighborhood1 xxx xx. students nested within schools within neighborhoods

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

Module 2.5: Difference-in-Differences Designs

Survey commands in STATA

Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS

Notes on PS2

Stata v 12 Illustration. One Way Analysis of Variance

AcaStat How To Guide. AcaStat. Software. Copyright 2016, AcaStat Software. All rights Reserved.

Introduction of STATA

CHAPTER 6 ASDA ANALYSIS EXAMPLES REPLICATION SAS V9.2

Impact Of Rural Development Policy And Less Favored Areas Scheme: A Difference In Difference Matching Approach

The Impact of Farmer Field School Training on Net Crop Income: Evidence from Smallholder Maize Farmers in Oromia, Ethiopia

Tabulate and plot measures of association after restricted cubic spline models

Network-Based Job Search

Working with Stata Inference on proportions

This example demonstrates the use of the Stata 11.1 sgmediation command with survey correction and a subpopulation indicator.

Foley Retreat Research Methods Workshop: Introduction to Hierarchical Modeling

Interactions made easy

SUGGESTED SOLUTIONS Winter Problem Set #1: The results are attached below.

Correlated Random Effects Panel Data Models

Mixed Mode Surveys in Business Research: A Natural Experiment. Dr Andrew Engeli March 14 th 2018

PSC 508. Jim Battista. Dummies. Univ. at Buffalo, SUNY. Jim Battista PSC 508

3. The lab guide uses the data set cda_scireview3.dta. These data cannot be used to complete assignments.

FOLLOW-UP NOTE ON MARKET STATE MODELS

Week 11: Collinearity

Propensity score matching with clustered data in Stata

ROBUST ESTIMATION OF STANDARD ERRORS

Examples of Using Stata v11.0 with JRR replicate weights Provided in the NHANES data set

Post-Estimation Commands for MLogit Richard Williams, University of Notre Dame, Last revised February 13, 2017

Soci Statistics for Sociologists

Center for Demography and Ecology

Checking the model. Linearity. Normality. Constant variance. Influential points. Covariate overlap

*STATA.OUTPUT -- Chapter 13

Longitudinal Data Analysis, p.12

You can find the consultant s raw data here:

Unit 2 Regression and Correlation 2 of 2 - Practice Problems SOLUTIONS Stata Users

Does Farmer Field School Training Improve Technical Efficiency? Evidence from Smallholder Maize Farmers in Oromia, Ethiopia

Survival analysis. Solutions to exercises

Technical note The treatment effect: Comparing the ESR and PSM methods with an artificial example By: Araar, A.: April 2015:

. *increase the memory or there will problems. set memory 40m (40960k)

ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics May 2014

What can a CIE tell us about the origins of negative treatment effects of a training programme

The Multivariate Regression Model

Compartmental Pharmacokinetic Analysis. Dr Julie Simpson

rat cortex data: all 5 experiments Friday, June 15, :04:07 AM 1

Experiment Outcome &Literature Review. Presented by Fang Liyu

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Integrating Solar and Net Zero Homes. EFG May 2016

Interpreting and Visualizing Regression models with Stata Margins and Marginsplot. Boriana Pratt May 2017

MULTIPLE IMPUTATION. Adrienne D. Woods Methods Hour Brown Bag April 14, 2017

Economic Benefits of the Wonderbag Cooking Technology: An Impact Assessment

Midterm Exam. Friday the 29th of October, 2010

Timing Production Runs

Farm level impact of rural development policy: a conditional difference in difference matching approach

Factor Analysis and Structural Equation Modeling: Exploratory and Confirmatory Factor Analysis

Appendix C: Lab Guide for Stata

Applied Econometrics

Journal of Biology, Agriculture and Healthcare ISSN (Paper) ISSN X (Online) Vol.6, No.23, 2016

MNLM for Nominal Outcomes

Determining PhD holders salaries in social sciences and humanities: impact counts

DAY 2 Advanced comparison of methods of measurements

+? Mean +? No change -? Mean -? No Change. *? Mean *? Std *? Transformations & Data Cleaning. Transformations

energy usage summary (both house designs) Friday, June 15, :51:26 PM 1

Who Are My Best Customers?

SAS/STAT 14.3 User s Guide The PSMATCH Procedure

Exploring Functional Forms: NBA Shots. NBA Shots 2011: Success v. Distance. . bcuse nbashots11

PROPENSITY SCORE MATCHING A PRACTICAL TUTORIAL

The SPSS Sample Problem To demonstrate these concepts, we will work the sample problem for logistic regression in SPSS Professional Statistics 7.5, pa

The Multivariate Dustbin

Transcription:

This is a quick-and-dirty example for some syntax and output from pscore and psmatch2. It is critical that when you run your own analyses, you generate your own syntax. Both of these procedures have very good help files (and a Stata Journal article for pscore). It s easy to see what each of these commands and options does, and you ll likely want to adjust some options to assess the sensitivity of your results. Never use results from commands you don t understand!. pscore married w1male w3age w3white w3black w3natam w3asian w3hispan w3borncit w3lwmom > w3lwdad w3paratt_m w3paratt_f w3badhealth w3feelold w3feeladult w3marij w3othdrug w3drink w3binge w3evsmoke w3regsmoke w1nviolent w1violent w3anarrests w1victim w3victim w1gpa w1aspir w1expec w3highsch w3somecoll w3college w3postba w3studfull w3studpart w3hasjob w3jobstab w3milreserv w3milactive w3milever w3willmarry w3anyrel w3anyrelnum w3virgin w3firstsex w3birthcon w3rbirthcon w3condom w3rcondom w3pregnum /* > */, pscore(logit1) comsup numblo(5) level (0.001) blockid(block1) logit Algorithm to estimate the propensity score The treatment is married married Freq. Percent Cum. 0 13,377 79.75 79.75 1 3,397 20.25 100.00 Total 16,774 100.00 Estimation of the propensity score Iteration 0: log likelihood = -5485.472 Iteration 1: log likelihood = -4897.4988 Iteration 2: log likelihood = -4854.9867 Iteration 3: log likelihood = -4854.195 Iteration 4: log likelihood = -4854.1946 Logistic regression Number of obs = 10264 LR chi2(50) = 1262.55 Prob > chi2 = 0.0000 Log likelihood = -4854.1946 Pseudo R2 = 0.1151 ------------------------------------------------------------------------------ married Coef. Std. Err. z P> z [95% Conf. Interval] ------------------------------ w1male -.1918493.0566349-3.39 0.001 -.3028517 -.0808469 w3age.1168165.0176689 6.61 0.000.082186.151447 w3white.176504.1284245 1.37 0.169 -.0752034.4282114 w3black -.9351439.1411776-6.62 0.000-1.211847 -.6584409 w3natam -.043757.1294094-0.34 0.735 -.2973947.2098808 w3asian -.4113568.1461858-2.81 0.005 -.6978758 -.1248378

w3hispan -.2007916.0778879-2.58 0.010 -.3534491 -.0481341 w3borncit -.550219.3001274-1.83 0.067-1.138458.0380199 w3lwmom -.1559957.0767775-2.03 0.042 -.3064768 -.0055146 w3lwdad.0503012.0819374 0.61 0.539 -.1102932.2108956 w3paratt_m.0191287.0073715 2.59 0.009.0046808.0335766 w3paratt_f.0252506.0105785 2.39 0.017.0045172.045984 w3badhealth -.0499224.0317887-1.57 0.116 -.112227.0123823 w3feelold.0386948.0322794 1.20 0.231 -.0245716.1019612 w3feeladult.0627364.0271365 2.31 0.021.0095498.115923 w3marij -.3239558.0672887-4.81 0.000 -.4558392 -.1920724 w3othdrug -.1812095.1032346-1.76 0.079 -.3835456.0211267 w3drink -.0786059.0231718-3.39 0.001 -.1240217 -.03319 w3binge -.0272923.0254246-1.07 0.283 -.0771237.0225391 w3evsmoke -.1133643.074764-1.52 0.129 -.259899.0331705 w3regsmoke -.0234748.0857581-0.27 0.784 -.1915575.144608 w1nviolent -.0196366.0201483-0.97 0.330 -.0591266.0198535 w1violent -.0178439.0318575-0.56 0.575 -.0802835.0445957 w3anarrests -.0731794.0486501-1.50 0.133 -.1685319.0221731 w1victim.0334187.0383416 0.87 0.383 -.0417295.1085669 w3victim -.0297907.0458016-0.65 0.515 -.1195602.0599787 w1gpa.0658452.0404899 1.63 0.104 -.0135136.145204 w1aspir -.0467139.0374636-1.25 0.212 -.1201412.0267133 w1expec.0467394.0350962 1.33 0.183 -.022048.1155267 w3highsch.0607136.1008166 0.60 0.547 -.1368833.2583105 w3somecoll.0249917.1097448 0.23 0.820 -.1901042.2400876 w3college.2404832.1312462 1.83 0.067 -.0167547.4977211 w3postba.2201523.15622 1.41 0.159 -.0860333.526338 w3studfull.0473986.0704989 0.67 0.501 -.0907766.1855739 w3studpart -.0288639.0920649-0.31 0.754 -.2093079.1515801 w3hasjob.2716228.0693612 3.92 0.000.1356774.4075682 w3jobstab.1435788.1081439 1.33 0.184 -.0683794.3555369 w3milreserv.0306402.1881574 0.16 0.871 -.3381415.399422 w3milactive.5259571.202558 2.60 0.009.1289506.9229635 w3milever.4027751.193226 2.08 0.037.0240591.7814911 w3willmarry -.4840388.0285845-16.93 0.000 -.5400633 -.4280143 w3anyrel.1897455.0802232 2.37 0.018.032511.34698 w3anyrelnum.0044024.0094108 0.47 0.640 -.0140424.0228472 w3virgin -.3562088.0964476-3.69 0.000 -.5452426 -.1671749 w3firstsex -.0176317.0135987-1.30 0.195 -.0442846.0090212 w3birthcon.0362381.0265115 1.37 0.172 -.0157235.0881997 w3rbirthcon.1617378.0847088 1.91 0.056 -.0042885.3277641 w3condom -.0584226.027779-2.10 0.035 -.1128685 -.0039767 w3rcondom -.1620617.0857905-1.89 0.059 -.330208.0060845 w3pregnum -.0082511.0374689-0.22 0.826 -.0816889.0651866 _cons -3.260862.5315544-6.13 0.000-4.302689-2.219034 ------------------------------------------------------------------------------ Note: the common support option has been selected The region of common support is [.00563074,.78917042] Description of the estimated propensity score in region of common support Estimated propensity score -------------------------------------------------------------

Percentiles Smallest 1%.018827.0056307 5%.0388676.0060989 10%.0602847.0065916 Obs 11354 25%.1136353.0066464 Sum of Wgt. 11354 50%.2019286 Mean.2299126 Largest Std. Dev..145665 75%.3265931.7585653 90%.4407855.7675806 Variance.0212183 95%.5066688.7880642 Skewness.6949197 99%.6128524.7891704 Kurtosis 2.835612 ** Step 1: Identification of the optimal number of blocks Use option detail if you want more detailed output ** The final number of blocks is 8 This number of blocks ensures that the mean propensity score is not different for treated and controls in each blocks ****** Step 2: Test of balancing property of the propensity score Use option detail if you want more detailed output ****** Variable w3firstsex is not balanced in block 1 The balancing property is not satisfied Try a different specification of the propensity score pscore tells you exactly which variables failed to balance. You ll modify your model to achieve balance on observed covariates. Inferior of block married of pscore 0 1 Total 0 2,127 152 2,279.1 1,351 177 1,528.15 1,169 232 1,401.2 1,609 521 2,130.3 975 507 1,482.4 479 414 893.5 185 242 427.6 45 75 120 Total 7,940 2,320 10,260 Note: the common support option has been selected ******************************************* End of the algorithm to estimate the pscore *******************************************

Running pscore again, having respecified the propensity score model as much as needed.... pscore married covariates covariates covariates /* > */, pscore(logit2) comsup numblo(5) level (0.001) blockid(block1) logit Algorithm to estimate the propensity score The treatment is married married Freq. Percent Cum. 0 13,377 79.75 79.75 1 3,397 20.25 100.00 Total 16,774 100.00 Estimation of the propensity score Iteration 0: log likelihood = -5499.6148 Iteration 1: log likelihood = -4909.949 Iteration 2: log likelihood = -4867.2317 Iteration 3: log likelihood = -4866.4337 Iteration 4: log likelihood = -4866.4333 Logistic regression Number of obs = 10300 LR chi2(49) = 1266.36 Prob > chi2 = 0.0000 Log likelihood = -4866.4333 Pseudo R2 = 0.1151 ------------------------------------------------------------------------------ married Coef. Std. Err. z P> z [95% Conf. Interval] ------------------------------ w1male -.1956318.0565382-3.46 0.001 -.3064447 -.084819 etc etc etc _cons -3.47201.5002261-6.94 0.000-4.452435-2.491585 ------------------------------------------------------------------------------ Note: the common support option has been selected The region of common support is [.00574559,.78324625] Description of the estimated propensity score in region of common support Estimated propensity score ------------------------------------------------------------- Percentiles Smallest

1%.0186865.0057456 5%.0392789.0058879 10%.0598639.0067045 Obs 11393 25%.1134345.0067167 Sum of Wgt. 11393 50%.2011893 Mean.2294791 Largest Std. Dev..1454779 75%.3266644.7415157 90%.4404727.7495315 Variance.0211638 95%.5062561.7682616 Skewness.6943207 99%.6122316.7832463 Kurtosis 2.831798 ** Step 1: Identification of the optimal number of blocks Use option detail if you want more detailed output ** The final number of blocks is 8 This number of blocks ensures that the mean propensity score is not different for treated and controls in each blocks ****** Step 2: Test of balancing property of the propensity score Use option detail if you want more detailed output ****** The balancing property is satisfied Success! This table shows the inferior bound, the number of treated and the number of controls for each block Inferior of block married of pscore 0 1 Total.0057456 2,144 150 2,294.1 1,374 182 1,556.15 1,154 240 1,394.2 1,620 519 2,139.3 975 507 1,482.4 476 413 889.5 186 239 425.6 42 74 116 Total 7,971 2,324 10,295 Note: the common support option has been selected ******************************************* End of the algorithm to estimate the pscore *******************************************

. psgraph, treated(married) pscore(logit2) bin(50) Note: the y axis in psgraph is proportional by group the treated and untreated are not necessarily on the same scale.

. attnd w4anycrime married, pscore(logit2) comsup dots detail [Or: attnd w4anycrime married covariates covariates covariates, logit comsup dots detail to find a propensity score, match, and get estimates all in one command.] But remember: it s better to go one step at a time! ************ Estimation of the ATT with the nearest neighbor matching method Random draw version ************ Note: the common support option has been selected The region of common support is [.00574559,.78324625] The outcome is w4anycrime Variable Obs Mean Std. Dev. Min Max ---------------------- w4anycrime 9824.1673453.3733029 0 1 The treatment is married married Freq. Percent Cum. 0 7,971 77.43 77.43 1 2,324 22.57 100.00 Total 10,295 100.00 The distribution of the pscore is Estimated propensity score ------------------------------------------------------------- Percentiles Smallest 1%.0186865.0057456 5%.0392789.0058879 10%.0598639.0067045 Obs 11393 25%.1134345.0067167 Sum of Wgt. 11393 50%.2011893 Mean.2294791 Largest Std. Dev..1454779 75%.3266644.7415157 90%.4404727.7495315 Variance.0211638 95%.5062561.7682616 Skewness.6943207 99%.6122316.7832463 Kurtosis 2.831798 The program is searching the nearest neighbor of each treated unit. This operation may take a while. Forward search (233 missing values generated) Backward search (198 missing values generated) Choice between backward or forward match

Display of final results The number of treated is 2324 The number of treated which have been matched is 2324 Average absolute pscore difference between treated and controls Variable Obs Mean Std. Dev. Min Max ---------------------- PSDIF 2324.0001466.0007497 5.67e-08.0187302 Average outcome of the matched treated Variable Obs Mean Std. Dev. Min Max ---------------------- w4anycrime 2322.0865633.2812546 0 1 Average outcome of the matched controls Variable Obs Weight Mean Std. Dev. Min Max ------------------------------- w4anycrime 1282 2323.99993.1461187.353363 0 1 ATT estimation with Nearest Neighbor Matching method (random draw version) Analytical standard errors --------------------------------------------------------- n. treat. n. contr. ATT Std. Err. t --------------------------------------------------------- 2324 1282-0.060 0.013-4.533 --------------------------------------------------------- Note: the numbers of treated and controls refer to actual nearest neighbour matches ************************* End of the estimation with the nearest neighbor matching (random draw) method ************************* We can compare those matched results above to our unmatched full sample... but you d want to pay attention to the Ns.. sum w4anycrime if (married == 1) Variable Obs Mean Std. Dev. Min Max ---------------------- w4anycrime 3394.0936948.2914465 0 1. sum w4anycrime if (married == 0) Variable Obs Mean Std. Dev. Min Max ---------------------- w4anycrime 8606.2103184.4075584 0 1

psmatch2 doesn t give as much detail, but it does have more matching options, and it conveniently presents unmatched and matched comparisons side-by-side.. psmatch2 married, outcome(w4anycrime) pscore(logit2) neighbor(1) caliper(.001) common ---------------------------------------------------------------------------------------- Variable Sample Treated Controls Difference S.E. T-stat w4anycrime Unmatched.086563307.204088001 -.117524694.009072566-12.95 ATT.088287489.15350488 -.065217391.012758584-5.11 psmatch2: psmatch2: Common Treatment support assignment Off suppo On suppor Total Untreated 0 6,409 6,409 Treated 68 2,254 2,322 Total 68 8,663 8,731. psmatch2 married, outcome(w4anycrime) pscore(logit2) neighbor(1) caliper(.001) common noreplacement ---------------------------------------------------------------------------------------- Variable Sample Treated Controls Difference S.E. T-stat w4anycrime Unmatched.086563307.204088001 -.117524694.009072566-12.95 ATT.09382716.154567901 -.060740741.010323337-5.88 Note: S.E. does not take into account that the propensity score is estimated. psmatch2: psmatch2: Common Treatment support assignment Off suppo On suppor Total Untreated 0 6,409 6,409 Treated 297 2,025 2,322 Total 297 8,434 8,731. psmatch2 married, outcome(w4anycrime) pscore(logit2) neighbor(1) caliper(.0001) common ---------------------------------------------------------------------------------------- Variable Sample Treated Controls Difference S.E. T-stat w4anycrime Unmatched.086563307.204088001 -.117524694.009072566-12.95 ATT.102809325.160191273 -.057381949.013334398-4.30 Note: S.E. does not take into account that the propensity score is estimated. psmatch2: psmatch2: Common Treatment support assignment Off suppo On suppor Total Untreated 0 6,409 6,409 Treated 649 1,673 2,322 Total 649 8,082 8,731

. psmatch2 married, outcome(w4anycrime) pscore(logit2) kernel common ---------------------------------------------------------------------------------------- Variable Sample Treated Controls Difference S.E. T-stat w4anycrime Unmatched.086563307.204088001 -.117524694.009072566-12.95 ATT.086787565.15045892 -.063671355.009256977-6.88 Note: S.E. does not take into account that the propensity score is estimated. psmatch2: psmatch2: Common Treatment support assignment Off suppo On suppor Total Untreated 0 6,409 6,409 Treated 6 2,316 2,322 Total 6 8,725 8,731. psmatch2 married, outcome(w4anycrime) pscore(logit2) kernel common ---------------------------------------------------------------------------------------- Variable Sample Treated Controls Difference S.E. T-stat w4anycrime Unmatched.086563307.204088001 -.117524694.009072566-12.95 ATT.086787565.15045892 -.063671355.009256977-6.88 Note: S.E. does not take into account that the propensity score is estimated. psmatch2: psmatch2: Common Treatment support assignment Off suppo On suppor Total Untreated 0 6,409 6,409 Treated 6 2,316 2,322 Total 6 8,725 8,731