How to Randomize? Dean Karlan Yale University
Outline of Today's Lectures This lecture: Overview of different ways to randomize Convincing/how to gather support for evaluation Typical plan
Good Evaluation Not just "How did we do?" Also "How should we do it?" Key component: Identify the implementers' questions Answer them Make the research a win-win for operations
Different ways to randomize
Randomization Designs Lottery design (transparent or not) Phase-in design Encouragement design Multiple treatments Rotation design (e.g., balsakhi case) Note: These are not mutually exclusive.
Simple Randomization Think of a clinical trial Take 1000 people and give half of them the drug Can we simply apply this idea to social programs?
What are the constraints when we randomize in social situations?
Key constraints Cannot interfere significantly with program operations The clinical trial model above rarely fits NGOs Cannot be perceived as unfair Must be politically feasible Must be ethical
Limited Resources Many programs have limited resources Many more eligible recipients than resources will allow service for Examples: Training for entrepreneurs or farmers Program for X schools or clinics How to allocate these resources?
Lottery design Randomly choose from applicant pool In this design participants are explicitly told that winners and losers will end up with different outcomes Use when there is no good reason to choose among a subset of applicants Lottery is perceived as fair Need not be transparent (credit scoring) Often politically more feasible
What are other types of lotteries?
Lottery Design Built in and optimizing operations Credit scoring Lottery need not be over individuals Can be over groups Which micro-credit groups? Can be over communities Which villages should we enter? Can be over schools Balsakhi
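A lottery over individuals, groups, or villages can be sketched in a few lines; the function and variable names below are hypothetical, and the fixed seed is one way to keep the draw reproducible and auditable:

```python
import random

def lottery(applicants, n_slots, seed=42):
    """Draw lottery winners from an applicant pool.

    Winners receive the program (treatment); losers form the control
    group. The seed makes the draw reproducible for auditing.
    """
    rng = random.Random(seed)
    winners = rng.sample(applicants, n_slots)
    winner_set = set(winners)
    losers = [a for a in applicants if a not in winner_set]
    return winners, losers

# The same lottery can run over micro-credit groups or villages
# instead of individuals:
villages = ["village_%02d" % i for i in range(10)]
treated, control = lottery(villages, n_slots=5)
```

Because the unit passed in is arbitrary, the same code covers lotteries over schools or communities.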
Ethicality and Political Feasibility Transparency often helps, sometimes does not Can we give something to the control sites? Sometimes test A versus B, not A versus nothing.
What to do if a program has 500 applicants for roughly 500 slots? Or fewer?
A less obvious lottery design One trick: Advertise program and greatly increase applicants Now lottery is feasible Is this ethical? Would it be ethical if we added slots to the program and only gave these to the new applicants? Or desirable from the evaluation point of view?
When is a lottery impossible?
Where a lottery fails Suppose there are 2000 applicants. Screening of applications/need produces 500 worthy candidates There are 500 slots Lottery infeasible
What can be done in this situation?
Potential Solutions Screening Rule What are they screening for? What elements are essential? What elements are arbitrary? Example: Training program 2000 candidates 1000 fit criteria such as poor enough to be worthy, qualified enough to take advantage of training 250 chosen out of the 1000 based on particular attributes NGO willing to allocate remaining 250 by lottery Lesson: Many parts of a selection rule are designed simply to weed out candidates
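The two-step allocation above (screen on the essential criteria, honor the NGO's deliberate picks, then fill the remaining slots by lottery) can be sketched as follows; all names are hypothetical, and the even-ID screen simply stands in for "fits the criteria":

```python
import random

def screen_then_lottery(candidates, passes_screen, deliberate_picks,
                        n_lottery_slots, seed=7):
    """Screen on essential criteria, keep the deliberate selections,
    and allocate the remaining slots by lottery among the rest of
    the eligible pool."""
    rng = random.Random(seed)
    eligible = [c for c in candidates if passes_screen(c)]
    picks = set(deliberate_picks)
    pool = [c for c in eligible if c not in picks]
    lottery_winners = rng.sample(pool, n_lottery_slots)
    return list(deliberate_picks) + lottery_winners

# Slide's example: 2000 candidates, 1000 pass screening (here: even IDs),
# 250 picked on particular attributes, 250 more slots filled by lottery.
candidates = list(range(2000))
eligible = [c for c in candidates if c % 2 == 0]   # the 1000 who fit criteria
picks = eligible[:250]                             # 250 deliberate picks
selected = screen_then_lottery(candidates, lambda c: c % 2 == 0, picks, 250)
```

The lottery randomizes only over the part of the rule that was arbitrary, leaving the essential screening intact.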
Different kind of randomized design
Another point of interaction Programs have lives They have initial stages They expand All these are good points to think about randomization
Phase-in Design In this design everyone is told that they will end up with the same outcome, but some later than others. Use the fact that the program is going to expand Example: In 5 years, 500 schools will be covered. In 5 years, 250 districts will receive land tenure What determines which schools or districts will be covered in which year? Some choices may be based on need, potential impact, etc. Some choices largely arbitrary We can therefore choose who gets the program early at random Do this on the population about which choices are arbitrary
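The phase-in assignment can be sketched as follows, under the assumption that the order of expansion is arbitrary for these units; the names are hypothetical:

```python
import random

def phase_in_schedule(units, n_waves, seed=1):
    """Randomly assign units to expansion waves (e.g., years).

    Everyone is eventually treated; until its own wave arrives, a
    later wave serves as the control group for earlier waves.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal units round-robin so wave sizes differ by at most one.
    return {wave + 1: shuffled[wave::n_waves] for wave in range(n_waves)}

# Slide's example: 500 schools phased in over 5 years, 100 per year.
schools = ["school_%03d" % i for i in range(500)]
schedule = phase_in_schedule(schools, n_waves=5)
```

In practice the randomization would be run only on the subset of schools or districts for which the ordering is genuinely arbitrary.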
Other designs
Encouragement design In this design everyone is immediately eligible to receive the program, and there is enough funding to cover all of them. However, not all of them will necessarily take it up Pick some people at random and encourage them to use the program Use the non-encouraged as control Ethical? What population is this a treatment effect on?
Encouragement What is an encouragement? Something that makes some folks (treatment group) more likely to join than others (control group) Key criteria: Should not itself be a treatment Bad idea: intense entrepreneurship training program as encouragement for credit Good idea: marketing to some but not everyone; make sure the marketing isn't overly informative Think about who responds to the encouragement. Are they different? Bad idea: party with alcohol
Encouragement Design: Why Does this Work? First, an extreme example: perfect take-up among the encouraged, no take-up among the non-encouraged:

            Encouraged (Treatment)   Not Encouraged (Control)
Treated            100%                       0%
Untreated            0%                     100%

Let the treatment (encouraged) group earn 1 standard deviation more, and the control (non-encouraged) group remain unchanged. Note: The same difference exists between the treated and untreated. The impact of the intervention is 1 standard deviation higher income.
Encouragement Design: Why Does this Work? Now, add non-take-up in the encouraged group:

            Encouraged (Treatment)   Not Encouraged (Control)
Treated             50%                       0%
Untreated           50%                     100%

Now, using the same true effect size (1 sd), the average effect for the treatment (encouraged) group is 0.5 standard deviations (called the intent-to-treat, ITT, effect). Then divide the ITT by the differential proportion treated (0.5): the treatment-on-the-treated effect, TOT, is 0.5/0.5 = 1 standard deviation.
Encouragement Design: Why Does this Work? Last, add take-up in the not-encouraged group:

            Encouraged (Treatment)   Not Encouraged (Control)
Treated             50%                      25%
Untreated           50%                      75%

Now, using the same true effect size (1 sd), the average effect for the treatment (encouraged) group is 0.5 standard deviations, but the control group has now improved by 0.25 standard deviations. The ITT effect is thus the difference: 0.25 standard deviations. Then divide the ITT by the differential proportion treated (0.25): the treatment-on-the-treated effect, TOT, is 0.25/0.25 = 1 standard deviation.
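The arithmetic on the last two slides can be checked directly. A minimal sketch, under the slides' illustrative assumption that treatment raises outcomes by a constant amount for everyone who takes it up (function and parameter names are hypothetical):

```python
def itt_and_tot(take_up_encouraged, take_up_control, effect_on_treated):
    """Recover the treatment-on-the-treated (TOT) effect in an
    encouragement design.

    ITT = mean outcome difference between encouraged and non-encouraged.
    TOT = ITT divided by the differential take-up rate.
    Assumes a constant effect for everyone treated (illustration only).
    """
    differential_take_up = take_up_encouraged - take_up_control
    itt = effect_on_treated * differential_take_up
    tot = itt / differential_take_up
    return itt, tot

# Second slide: 50% take-up if encouraged, 0% if not, true effect 1 sd.
itt_a, tot_a = itt_and_tot(0.50, 0.00, 1.0)   # ITT 0.5, TOT 1.0
# Third slide: 50% take-up if encouraged, 25% if not.
itt_b, tot_b = itt_and_tot(0.50, 0.25, 1.0)   # ITT 0.25, TOT 1.0
```

In both cases the TOT recovers the true 1 standard deviation effect, even though the raw encouraged-versus-control difference (ITT) shrinks as take-up patterns blur.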
Multiple Treatments Sometimes you are not sure what intervention to implement You can then randomize different interventions with different (randomly chosen) populations Does this teach us about the benefit of any one intervention? Advantage: win-win for operations, can help answer questions for them, beyond simple impact!
Convincing "But I already know the answer." (Between the lines: "But I do not want to risk learning that we do not have impact.") Finding the right and willing partner
Convincing Practitioners Three critical items: Listen. Find ways to answer their operational questions while doing the research. Know the identity of key players and their perspectives. Culture important Sometimes difficult to disagree in person. Is there true buy-in?
Convincing Practitioners Typical specific concerns: Gossip Fairness Cost Human resource policies Length of study Politics Experimenting ethics
Convincing Practitioners Typical concerns: Gossip (contamination vs spillovers) Will be discussed in more detail on Day 4. In some settings, contamination is good. If the intervention is about information (e.g., allocation of school resources, school performance information, education on child nutrition or breastfeeding, deworming, etc.), then you want gossip. But you want to set up the experiment so that you can measure the total impact on those treated, both directly by you and indirectly by the gossip.
Convincing Practitioners Fairness Rare is the case of unlimited resources. Given limited resources, fairness is entirely in the framing: Provide everyone a 50% probability of receiving treatment Provide those who show up first a 100% probability of receiving treatment, and those who do not a 0% chance Randomization allows one to learn more about the ideal participants. If a non-random targeting method is used, one might be making a mistake and hence not maximizing impact.
Convincing Practitioners Typical concerns: Cost Often cheaper than non-experimental evaluations Certainly cheaper once benefits are included. Which costs more for survey work? Which has more downside risk in terms of yielding imprecise or biased results? The key question is not ex-post versus randomized trial, but evaluation versus no evaluation. When to evaluate?
Convincing Practitioners Typical concerns: Human resource policies HR policies often arise if salary is conditional on performance. Then, if the treatment influences performance, problems arise. Or, the treatment could present an alternative that threatens their job! Kenya: resentment among government teachers, who feared that the contract teachers would do well and that they would then lose their jobs. Microcredit: incentive pay is quite typical (interesting in its own right!) Calibrate the incentive system so that mean compensation is the same. If employees perceive that the treatment is good (or bad) for them, they may put forth extra effort to see it work (or not). Monitoring employee behavior can provide information on whether this is happening.
Convincing Practitioners Typical concerns: Length of study Politics: Not always possible. How to build rural roads and avoid corruption? Monitoring Corruption project (Olken, 2005): probability of audit raised from 4% to 100%; theft falls by 8% of expenditures.
Convincing Practitioners Typical concerns: Experimenting ethics: "It is wrong to use people as guinea pigs." "If you think it works, then it is wrong not to treat everyone." Answers: Why are prescription drugs different? All limited initiatives are in some way an experiment; the difference here is that we are controlling the experiment in order to learn. Limited resources: allocation has to happen somehow. Why not learn while we are doing it? Produce a public good for future generations.
Randomization Timeline The following slides are auxiliary slides to detail the typical timeline of a randomized controlled trial.
Randomization Timeline Planning 1. Identify problem and proposed solution 2. Identify key players 3. Identify key operations questions to include in study 4. Design randomization strategy 5. Define data collection plan Pilot Implementation 1. Identify target population (sample frame), and baseline data if applicable 2. Randomize 3. Implement intervention to treatment group 4. Measure outcomes
Randomization Timeline: Planning 1. Identify problem and proposed solution Define the problem (which should lead to the key hypotheses) Define the intervention (sometimes already established) Learn key hurdles in design of operations (maybe opportunity to help answer them). 2. Identify key players Top management Field staff Donors
Randomization Timeline: Planning 3. Identify key operations questions to include in study Find win-win opportunities for operations How to best market? How to sustain the program? Pricing policy Generating demand through spillovers Types or extent of training?
Randomization Timeline: Planning 4. Design randomization strategy (more on this later in this lecture) Basic strategy Sample frame Unit of randomization Stratification
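The stratification step above can be illustrated with a minimal sketch (names hypothetical): randomize within strata so that treatment and control are balanced on the stratifying variable by construction.

```python
import random
from collections import defaultdict

def stratified_assignment(units, stratum_of, seed=3):
    """Randomize treatment within strata so each stratum splits ~50/50.

    `stratum_of` maps a unit to its stratum, e.g., a region or a
    baseline score band.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)
    assignment = {}
    for members in strata.values():
        rng.shuffle(members)
        half = len(members) // 2
        for u in members[:half]:
            assignment[u] = "treatment"
        for u in members[half:]:
            assignment[u] = "control"
    return assignment

# Example: 100 clinics across 4 regions, stratified by region.
clinics = [(i, "region_%d" % (i % 4)) for i in range(100)]
assignment = stratified_assignment(clinics, stratum_of=lambda c: c[1])
```

With odd stratum sizes the extra unit lands in control here; which arm gets the extra unit is itself a design choice.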
Randomization Timeline: Planning 5. Define data collection plan Key point: Baseline data is NOT necessary. It can be nice, but is not necessary. More on this tomorrow
Randomization Timeline: Pilot Pilots vary in size & rigor Pilots & qualitative steps are important. (Why?) Sometimes a pilot is the evaluation Other times they are pilots for the evaluation
Randomization Timeline: Implementation 1. Identify target individuals or villages and collect baseline information (if applicable) 2. Randomize 3. Implement intervention to treatment group 4. Measure outcomes
Randomization Timeline: Implementation 1. Identify target individuals and collect baseline data 2. Randomize Real-time randomization, e.g., credit scoring All-at-once randomization: all villages known up front Waves: want to learn from one wave to the next; expansion plans not well defined throughout the country; one district at a time 3. Implement intervention to treatment group Internal controls are critical: nothing worse than doing all this work but not having control in the field! 4. Measure impact after the necessary delay to allow impact to occur Common question: How long should we wait? Operational considerations must be traded off. No one-size-fits-all answer. Want to wait long enough to make sure the impacts materialize.
Poverty Action Lab: Translating Research into Action MIT