Technical note The treatment effect: Comparing the ESR and PSM methods with an artificial example By: Araar, A.: April 2015:

Similar documents
EFA in a CFA Framework

ECON Introductory Econometrics Seminar 6

This is a quick-and-dirty example for some syntax and output from pscore and psmatch2.

The Effect of Prepayment on Energy Use. By: Michael Ozog, Ph.D. Integral Analytics, Inc.

The Multivariate Dustbin

Week 10: Heteroskedasticity

Introduction of STATA

Correlated Random Effects Panel Data Models

Stata notes: cd c:\copen\lab2\ Then submit the program (saved as c:\copen\lab2\povline.do) by typing. do povline

The Effects of nonfarm activities on farm households food consumption in rural Cambodia

Timing Production Runs

SHIPPING COST, SPOT RATE AND EBAY AUCTIONS: A STRUCTURED EQUATION MODELING APPROACH

Interpreting and Visualizing Regression models with Stata Margins and Marginsplot. Boriana Pratt May 2017

Practical Regression: Fixed Effects Models

Compartmental Pharmacokinetic Analysis. Dr Julie Simpson

Tutorial Segmentation and Classification

Survey commands in STATA

Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS

The Multivariate Regression Model

COMMERCIALIZATION EFFECTS ON HOUSEHOLD INCOME, POVERTY, AND DIVERSIFICATION: A COUNTERFACTUAL ANALYSIS OF MAIZE FARMERS IN KENYA

How to Get More Value from Your Survey Data

Harbingers of Failure: Online Appendix

Kuhn-Tucker Estimation of Recreation Demand A Study of Temporal Stability

Innovation and productivity across four European countries 1

MEASURING HUMAN CAPITAL AND ITS EFFECTS ON WAGE GROWTH

Supermarkets, farm household income and poverty: Insights from Kenya

A study of cartel stability: the Joint Executive Committee, Paper by: Robert H. Porter

Online Appendix Stuck in the Adoption Funnel: The Effect of Interruptions in the Adoption Process on Usage

for var trstprl trstlgl trstplc trstplt trstep: reg X trust10 stfeco yrbrn hinctnt edulvl pltcare polint wrkprty

Wage Mobility within and between Jobs

Innovation Determinants over Industry Life Cycle

Can improved food legume varieties increase technical efficiency in crop production?

The Impact of Firm s R&D Strategy on Profit and Productivity

Assessing the Macroeconomic Effects of Competition Policy - the Impact on Economic Growth

UPDATE OF THE NEAC MODAL-SPLIT MODEL Leest, E.E.G.A. van der Duijnisveld, M.A.G. Hilferink, P.B.D. NEA Transport research and training

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

The Reallocation of Agricultural Labour across Sectors:

Employment, innovation, and productivity: evidence from Italian microdata

Foreign-owned Plants and Job Security

Technology and incentive regulation in the Italian motorways industry

Getting Started with HLM 5. For Windows

Impacts of Farmer Inputs Support Program on Beneficiaries in Gwembe District of Zambia

Do Fertilizer Subsidies Affect the Demand for Commercial Fertilizer? An Example from Malawi.

On-the-Job Search and Wage Dispersion: New Evidence from Time Use Data

DIFFERENT ESTIMATION STRATEGIES FOR THE COMMON FACTOR MODEL

ROBUST ESTIMATION OF STANDARD ERRORS

Examination of Cross Validation techniques and the biases they reduce.

Impacts of Climate Change and Extreme Weather on U.S. Agricultural Productivity

Technological Change, Trade in Intermediates and the Joint Impact on Productivity

Higher Education and Economic Development in the Balkan Countries: A Panel Data Analysis

IMPACT OF CONSERVATION FARMING ON SMALLHOLDER FARM HOUSEHOLD INCOMES IN ZAMBIA: EVIDENCE USING AN ENDOGENOUS SWITCHING REGRESSION MODEL

An Econometric Assessment of Electricity Demand in the United States Using Panel Data and the Impact of Retail Competition on Prices

WRITTEN PRELIMINARY Ph.D. EXAMINATION. Department of Applied Economics. University of Minnesota. June 16, 2014 MANAGERIAL, FINANCIAL, MARKETING

Tagging and Targeting of Energy Efficiency Subsidies

Assessing Thai Entrepreneur Performance: Steps Toward a New Venture Production Function

STATISTICS PART Instructor: Dr. Samir Safi Name:

Benefit transfers as a field experiment in Poland the case of forest use values

Getting Started With PROC LOGISTIC

Labor Productivity, Product Innovation and Firm-level Price Growth

Trade Liberalization and Inequality: a Dynamic Model with Firm and Worker Heterogeneity

Using Weights in the Analysis of Survey Data

Dynamics of Consumer Demand for New Durable Goods

Identifying the effects of market imperfections for a scale biased agricultural technology: Tractors in Nigeria

QUANTIFICATION OF HUMAN ERROR RATES USING A SLIM-BASED APPROACH

Measuring Household Resilience to Food Insecurity

*STATA.OUTPUT -- Chapter 13

Conservation Payments, Liquidity Constraints and Off-Farm Labor: Impact of the Grain for Green Program on Rural Households in China

IMPACT OF SELF HELP GROUP IN ECONOMIC DEVELOPMENT OF RURAL WOMEN WITH REFERENCE TO DURG DISTRICT OF CHHATTISGARH

Dynamic Probit models for panel data: A comparison of three methods of estimation

Unit 6: Simple Linear Regression Lecture 2: Outliers and inference

Wage and Productivity Dispersion in U.S. Manufacturing: The Role of Computer Investment

The evaluation of the full-factorial attraction model performance in brand market share estimation

A SAS Macro to Analyze Data From a Matched or Finely Stratified Case-Control Design

Rice or riots: On food production and conflict severity across India

ARIMA LAB ECONOMIC TIME SERIES MODELING FORECAST Swedish Private Consumption version 1.1

Final Exam Spring Bread-and-Butter Edition

Propensity Scores for Multiple Treatments

ESTIMATING GENDER DIFFERENCES IN AGRICULTURAL PRODUCTIVITY: BIASES DUE TO OMISSION OF GENDER-INFLUENCED VARIABLES AND ENDOGENEITY OF REGRESSORS

Determinants and Impacts of Modern Agricultural Technology Adoption in West Wollega: The Case of Gulliso District

Groupe de Recherche en Économie et Développement International. Cahier de recherche / Working Paper A Model of Horizontal Inequality

The Application of Quasi- Experimental Methods in the Context of Marketing: A Literature Review

Module 7: Multilevel Models for Binary Responses. Practical. Introduction to the Bangladesh Demographic and Health Survey 2004 Dataset.

Full-time schooling, part-time schooling, and wages: returns and risks in Portugal

With a Little Help from My Friends? Quality of Social Networks, Job Finding Rates and Job Match Quality

Labor Economics. Evidence on Efficiency Wages. Sébastien Roux. ENSAE-Labor Economics. April 4th 2014

Decentralization and Pollution Spillovers. from the Re-drawing of County Borders in Brazil.

The Economic and Social Review, Vol. 33, No. 1, Spring, 2002, pp Portfolio Effects and Firm Size Distribution: Carbonated Soft Drinks*

Gasoline Consumption Analysis

AcaStat How To Guide. AcaStat. Software. Copyright 2016, AcaStat Software. All rights Reserved.

Collective Bargaining and Faculty Job Satisfaction

SDMXUSE MODULE TO IMPORT DATA FROM STATISTICAL AGENCIES USING THE SDMX STANDARD 2016 LONDON STATA USERS GROUP MEETING. Sébastien Fontenay

CHAPTER 4 EXAMPLES: EXPLORATORY FACTOR ANALYSIS

SDT 366. PRODUCT MIX CHANGES AND PERFORMANCE IN CHILEAN PLANTS Autores: Roberto Alvarez, Claudio Bravo-Ortega y Lucas Navarro

Full-time schooling, part-time schooling, and wages: returns and risks in Portugal

Title. Syntax. Description. xt Introduction to xt commands

Hierarchical Linear Modeling: A Primer 1 (Measures Within People) R. C. Gardner Department of Psychology

No Bernd Hayo and Matthias Neuenkirch. Bank of Canada Communication, Media Coverage, and Financial Market Reactions

Three Dimensional Interpretations of the Korean Housing Market: Structural Relationships among Sales, Chonsei, and Monthly Rent Markets

Lees J.A., Vehkala M. et al., 2016 In Review

Telecommunications Churn Analysis Using Cox Regression

Transcription:

Technical note The treatment effect: Comparing the ESR and PSM methods with an artificial example By: Araar, A.: April 2015: In this brief note, we propose to use an artificial example data in order to compare the results of PSM and those of the ESR model. We review also the theoretical framework to assess accurately the average treatment effects with the ESR model. Based on this, we show the subtle error in the theoretical framework of the paper of Sajaia and Luskin (2004) and then in their mspredict post command. In addition to the corrected mspredict post command, a new movestay post command is produced (msat). This new post command can be used to estimate the and ATE (see also Fuglie, K. O and D. J. Bosch (1995) for the theoretical framework). For more details, see the part B of this note. Fuglie, K. O and D. J. Bosch (1995). Implications of soil nitrogen testing: a switching regression analysis. American Journal of Agricultural Economics Vol.77: 891 900. PART A: The artificial example We assume that the number of observations is 1000: set more off clear all set seed 1234 set obs 1000 Also, we assume that we have three regions: gen region = 1 in 1/300 replace region = 2 in 301/600 replace region = 3 in 601/1000 It is assumed that the first region has more working age population: gen age = min(int(runiform()*65+15), 65) replace age = age+5 if region==1 gen educ = min(int(runiform()*5+1), 6) It is assumed that the program is not randomly attributed and the population in region_1 have more probability to be selected. Also, it is assumed that the selection depends partially on the age: set seed 7421 gen treatment=3*runiform()*(region==1)+0.5*runiform()*(region==2)+0.5*runiform()*(region==3)+(0.2+0.8*runiform())*(age> 30) replace treatment = treatment > 1 local a = 0.6 local b = 0.1 gen e= `a'*runiform() sum e if treatment ==1 qui replace e = e r(mean) if treatment ==1 sum e if treatment ==0 qui replace e = e r(mean) if treatment ==0 The outcome (income) depends on education, age, and the treatment. The parameter a enables to control for the predictive power of the two outcome models with the ESR method. The higher is a, the lower is the predictive power of the model. The parameter b enables to control for the contribution of the variable endogeneity (age). The higher is b, the higher is the endogeneity. In this artificial example, we know the exact value of the effect of the program, which is equal to 2: local at = 2

gen income = 60+0.5*educ+`b'*age+`at'*treatment + e It is assumed that the variable age is not observed, but it affects jointly the program selection and the outcome. This raises the endogeneity problem, and we will need to use or to construct an instrumental variable (inst). The latter is assumed to be not explained by the outcome: gen income0 =income `at'*treatment gen ins = (0.5+uniform())*age regress ins income0 predict inst, res At this stage, we can estimate the effects with the PSM and ESR methods: gen pw=1 xi: psmatch2 treatment i.region ins, outcome(income) cal(0.1) pw(pw) ate local att_psm = r(att) local atu_psm = r(atu) gen lincome = log(income) set seed 5241 xi: movestay lincome educ, select(treatment i.region ins ) msat treatment, expand(yes) PSM ESR 2.212559 2.1918372 ATU 2.5462603 2.4961996 Based on the results above, the first conclusion is that the two models succeed to well capture the effect when the predictive power of the outcome models is high (lower level of a). Note that with the low endogeneity, the ESR turns to be an exogenous switching model, but the structure of the model continues to capture accurately the effect of the program. Now, we would like to do more tests and check how the results are affected by the level of endogeneity (parameter b) where the latter varies between 0.1 and 0.1. For this end, we select a moderate level of the parameter a (a=0.6). 1.6 1.8 2 2.2 2.4 -.1 -.05 0.05.1 Intensity of endogeneity: parameter (b) : PSM : ESR

Better than the PSM, the ESR model seems to be helpful in presence of endogeneity and where the CIA PSM condition becomes less checked. To checked sensitivity of results with predictive power of the ESR outcome models (inversely linked with the parameter a), we show the results according to a when b is fixed to 0.1. The two models succeed in estimating the affect, but the ESR shows a better performance. 2 2.2 2.4 2.6 2.8 3 0 2 4 6 8 10 Predictive power: parameter (a) : PSM : ESR Now, we control the parameters a and b, (a=0.6 and b=0.1) and we vary the predefined (see the command to generate the income) -10-5 0 5 10-10 -5 0 5 10 Constant : PSM : ESR

In the previous examples, we assume a homogenous treatment effect and even a reduced models can be used to estimate the impact. Now we assume that the treatment effect depends on the observed covariates (and no the observed covariate age: at=10+v*educt 0.01*age) and varies between 0.1 and 0.3 (also we have that: (a=0.6 and b=0.1)). 1.5 2 2.5 3 3.5 1.5 2 2.5 3 3.5 Hytherogenous Effect : PSM : ESR The PSM which is not conceived to treat the endogeneity problem is more biased in the case of heterogeneity through the unobservable part.

PART B: The ESR model and the estimation of average treatment effects Mainly, we assume a switching equation sorts individuals over two different states. With the Endogenous Switching Regression model, the ESR we assume that the observable outcome continue. Precisely, we have a model in which Consider the behavior of an agent with two binary outcome equations (participate to the program or not) and a criterion function T i that determines which regime the agent faces (with migrant / without migrant). T i can be interpreted as a treatment. =1 if 0 =0 if 0 (1) Regime1 : 1 1 1 : 1 and 1 =I 1 0 (2) Regime0 : : and 0 =I 0 0 (3) Where Are and are the two latent variables. We assume that the three residual:, et are jointly normally distributed, with a mean zero vector and correlation matrix,, Ω,, Where, 0,1. Since and are not observed simultaneously, the joint distribution of (, ) cannot be identified. In this estimation, we assume that, 1. The estimation is done by the Full specification of Maximum Likelihood model. This model enables also to estimate the treatment effect on treated and untreated. The log likelihood function is defined as follows: ln ln ln / / 1 ln ln / / Where. and. are respectively the density and cumulative density functions. is an optional. Also, we have that: / 1 1,2 Where is the coefficient of correlation between and.

The results of ESR can also be used to generate conditional expectations which will provide a concise measure of any efficiency differences among firms based on the credit market outcome. The following expressions are considered: 1, / (10) 0, /1 (11) 1, / (12) 0, /1 (13) As we can observe, the sign that precedes and is corrected compared to what was reported in Sajaia and Luskin (2004) movestay command as well as their related Stata paper. The subtle error comes from omitting the negative sign for the de definition of the Mills ratio for the non participants group (i.e. : 2 1 /1 ). Thus, even if the results of movestay Stata command are accurate, some of those of the mspredict are wrong. At this stage, I have corrected temporarily this post command (mspredict_ar.ado) and I will contact the movestay authors. I addition, I have programmed for the PEP teams the post command msat to estimate the treatment effects. syntax : msat varlist(min=1 max1), [hhsize(varname) expand(string)] The post command msat enables to estimate quickly the average treatment affects after the movestay command. The varname that follows this command is the dummy variable of the treatment. The estimation takes automatically into account the sampling weight. Options: hsize expand Household size. For example, to compute poverty at the individual level, one will want to weight household level observations by household size (in addition to sampling weights, best set in survey design). If we use the log of the outcome variable with the movestay command and we like to estimate the treatment effect on the outcome (not on the log of outcome), the user can add the option: expand(yes). Example: Estimated treatment effects based on the endogenous switching regression model Index Estimate Std. Err. t P> t [95% Conf. Interval] 2.192045.0704571 31.1118 0.0000 2.053784 2.330306 ATU 2.496426.0627154 39.8056 0.0000 2.373357 2.619495 ATE 2.374978.0472344 50.2807 0.0000 2.282288 2.467668 Note: The estimated standard errors ommit the part of samplind errors related to the estimation of the Beta's coefficients with the ESR model. where lincome is the log of the income (the outcome in this example). How to add the new post commands? Copy simply the mspredict_ar.ado and msat.ado files in c:/ado/plus/m/