********************************************************************************************** *******************************

Size: px
Start display at page:

Download "********************************************************************************************** *******************************"

Transcription

1 1 /* Workshop of impact evaluation MEASURE Evaluation-INSP, 2015*/ ********************************************************************************************** ******************************* DEMO: Propensity Score Matching Analysis *Description of the program: /*We want to evaluate the impact of a program on production abilities and the likelihood that women engage in farming or animal raising The program was implemented through the participation of various NGOs, who presented the project for funding Only women assigned to the NGOs were the potential beneficiaries to the program To evaluate the program we had a non-experimental design, which included all beneficiary women from the program Women in the comparison group were selected within the communities where women were participating in the intervention group and by interviewing their closest neighbor The treated/non-treated ratio was approximately 1:1*/ use demo_psm_sa, clear des id-nse3 storage display value variable name type format label variable label id str8 %9s (ID) woman state str2 %9s entidad federativa municipality float %90g group(entidad municip) locality float %90g group(entidad municip localid) program float %90g afirma program osc float %90g Societal Civil Organization osc1 byte %80g osc== osc2 byte %80g osc== osc3 byte %80g osc== osc4 byte %80g osc== osc5 byte %80g osc== osc6 byte %80g osc== osc7 byte %80g osc== osc8 byte %80g osc== osc9 byte %80g osc== osc10 byte %80g osc== age byte %90g agec float %130g agec categorical age agec1 byte %80g agec==age yrs agec2 byte %80g agec==age yrs agec3 byte %80g agec==age yrs agec4 byte %80g agec==age 50 o more kinship float %90g kin kinship1 byte %80g kinship==head kinship2 byte %80g kinship==wife kinship3 byte %80g kinship==other agehead byte %90g m02d05 age head of HH literacy byte %90g afirma Literacy: read & write spanish language byte %90g afirma speak indigenous language school float %170g school Schooling school1 byte %80g school==illiteracy school2 byte %80g school==primary school3 byte %80g school==secondary or more headschool float %170g school schooling head of HH headschool1 byte %80g headschool==illiteracy headschool2 byte %80g headschool==primary headschool3 byte %80g headschool==secondary or more farming byte %90g afirma self-employ in farm activities bluecollar byte %90g afirma work at factory selling byte %90g afirma work in selling products homework byte %90g afirma Work only at home oportunidades byte %90g afirma Beneficiary of Oportunidades program procampo byte %90g afirma (hh) beneficiary of Procampo program

2 2 loctime byte %110g m02d37 years living in the town voting byte %110g afirma If vote at presidential elections 2000 pregnancy float %90g afirma She was pregnant 1 year before children byte %90g afirma She have children children12 byte %90g afirma She have children<12 yrs HHsize byte %90g (hh) size child05 float %90g afirma (hh) with children 0-5 yrs pers517 float %90g afirma (hh) with persons 5-17 yrs pers60 float %90g afirma (hh) with persons 60 o more HHperswk float %90g work (hh) working persons hhwk1 byte %80g HHperswk== hhwk2 byte %80g HHperswk== hhwk3 byte %80g HHperswk== discap byte %190g afirma (hh) any person with disability foodexp float %90g (hh) monthly food expenditure totalexp float %90g (hh) monthly total expenditure oproductive float %90g afrima Participating in productive organizations before-program osocial float %90g afirma Participating in social organizations beforeprogram NSE float %90g Socio-economic level nse_st float %120g nse NSE stratified nse1 byte %80g nse_st==xxbajo nse2 byte %80g nse_st==xbajo nse3 byte %80g nse_st==bajo y medio /*Checking assignment to program*/ tab program program Freq Percent Cum No yes Total 1, *We suspect the presence of selection bias in the participation to the NGOs Therefore the women in the intervention and control groups could likely be not comparable in observable and unobservable characteristics We explore this hypothesis with the observed X*/ /*We check the distribution of the outcome variable (farming), by program status*/ tab program, sum(farming) Summary of self-employ in farm activities program Mean Std Dev Freq No yes Total logit farming program, cluster(locality) Iteration 0: log pseudolikelihood = Iteration 1: log pseudolikelihood = Iteration 2: log pseudolikelihood = Iteration 3: log pseudolikelihood = Logistic regression Number of obs = 1276 Wald chi2(1) = 2944 Prob > chi2 = Log pseudolikelihood = Pseudo R2 = (Std Err adjusted for 51 clusters in locality) Robust farming Coef Std Err z P> z [95% Conf Interval]

3 program _cons /*This would be the effect of the program if the assignment of the treatment was random*/ disp /*Due to the non experimantal design, we need to control for potential factors influencing participation in the program The challenge is to identify a comparison group as similar to the intervention group An alternative technique is the Propensity Score Matching method*/ /*STEP 1 Estimate the probability of participation to program using all the information, this is the PROPENSITY SCORE*/ probit program agec2 agec3 agec4 kinship2 kinship3 agehead language school2 school3 pregnancy oportunidades procampo oproductive osocial loctime voting children12 child05 pers517 pers60 HHsize hhwk2 hhwk3 discap nse2 nse3 osc2 osc3 osc4 osc5 osc6 osc7 osc8 osc9 osc10 munc2-munc33 (some output ommited) Iteration 0: log likelihood = Iteration 1: log likelihood = Iteration 2: log likelihood = Iteration 3: log likelihood = Iteration 4: log likelihood = Iteration 5: log likelihood = Iteration 6: log likelihood = Iteration 7: log likelihood = Iteration 8: log likelihood = Probit regression Number of obs = 1242 LR chi2(57) = Prob > chi2 = Log likelihood = Pseudo R2 = program Coef Std Err z P> z [95% Conf Interval] agec agec agec kinship kinship agehead language school school pregnancy oportunida~s procampo oproductive osocial loctime voting children child pers pers HHsize hhwk hhwk

4 4 discap nse nse osc osc osc osc osc osc osc osc osc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc munc _cons /*STEP 2: We obtain the propensity score by PREDICTING THE PROBABILITY OF PARTICIPATION*/ predict pscore if e(sample) (option pr assumed; Pr(program)) (59 missing values generated) /*In this example, we will use the kernel algorithm for doing the matching*/ psmatch2 program, pscore(pscore) kernel common bwidth(08) ***** NOTE: psmatch2 has to be downloaded from the internet and installed in your STATA copy /*BALANCED SAMPLE TESTS*/ pstest agec2 agec3 agec4 kinship2 kinship3 agehead language school2 school3 pregnancy oportunidades procampo oproductive osocial loctime voting children12 child05 pers517 pers60 HHsize hhwk2 hhwk3 discap NSE, support(_support) treated(program) summary both Mean %reduct t-test Variable Sample Treated Control %bias bias t p> t agec2 Unmatched Matched agec3 Unmatched Matched

5 agec4 Unmatched Matched kinship2 Unmatched Matched kinship3 Unmatched Matched agehead Unmatched Matched language Unmatched Matched school2 Unmatched Matched school3 Unmatched Matched pregnancy Unmatched Matched oportunida~s Unmatched Matched procampo Unmatched Matched oproductive Unmatched Matched osocial Unmatched Matched loctime Unmatched Matched voting Unmatched Matched children12 Unmatched Matched child05 Unmatched Matched pers517 Unmatched Matched pers60 Unmatched Matched HHsize Unmatched Matched hhwk2 Unmatched Matched hhwk3 Unmatched Matched discap Unmatched Matched NSE Unmatched Matched Summary of the distribution of the abs(bias) 5

6 6 BEFORE MATCHING Percentiles Smallest 1% % % Obs 25 25% Sum of Wgt 25 50% Mean Largest Std Dev % % Variance % Skewness % Kurtosis AFTER MATCHING Percentiles Smallest 1% % % Obs 25 25% Sum of Wgt 25 50% Mean Largest Std Dev % % Variance % Skewness % Kurtosis Sample Pseudo R2 LR chi2 p>chi Unmatched Matched /*We see that matching significantly decreased unbalance in the samples*/ /*THIS IS THE ACTUAL ESTIMATION OF THE ATT*/ psmatch2 program, out(farming) pscore(pscore) kernel common bwidth(08) Variable Sample Treated Controls Difference SE T-stat farming Unmatched ATT Note: SE does not take into account that the propensity score is estimated psmatch2: psmatch2: Common Treatment support assignment Off suppo On suppor Total Untreated Treated Total 88 1,152 1,240 keep if _support==1 (149 observations deleted) /*WE NEED TO DO A BOOTSTRAP TO OBTAIN CORRECT STANDARD ERRORS*/

7 7 bootstrap r(att), reps(150): psmatch2 program, kernel pscore(pscore) out(farming) bwidth(08) (running psmatch2 on estimation sample) Note: SE does not take into account that the propensity score is estimated Bootstrap replications (150) Bootstrap results Number of obs = 1152 Replications = 150 command: psmatch2 program, kernel pscore(pscore) out(farming) bwidth(08) _bs_1: r(att) Observed Bootstrap Normal-based Coef Std Err z P> z [95% Conf Interval] _bs_ *** In conclusion, our ATT estimate of the impact of the program is an increase of 14 percentage points This is down from 19 percentage points (our "naive" estimate)