World Development symposium on RCTs

World Development has a great collection of short pieces on RCTs.

Here is Martin Ravallion’s submission: 

…practitioners should be aware of the limitations of prioritizing unbiasedness, with RCTs as the a priori tool-of-choice. This is not to question the contributions of the Nobel prize winners. Rather it is a plea for assuring that the “tool-of-choice” should always be the best method for addressing our most pressing knowledge gaps in fighting poverty.

… RCTs are often easier to do with a non-governmental organization (NGO). Academic “randomistas,” looking for local partners, appreciate the attractions of working with a compliant NGO rather than a politically sensitive and demanding government. Thus, the RCT is confined to what NGOs can do, which is only a subset of what matters to development. Also, the desire to randomize may only allow an unbiased impact estimate for a non-randomly-selected sub-population—the catchment area of the NGO. And the selection process for that sub-sample may be far from clear. Often we do not even know what “universe” is represented by the RCT sample. Again, with heterogeneous impacts, the biased non-RCT may be closer to the truth for the whole population than the RCT, which is (at best) only unbiased for the NGO’s catchment area.

And here is David McKenzie’s take:

A key critique of the use of randomized experiments in development economics is that they largely have been used for micro-level interventions that have far less impact on poverty than sustained growth and structural transformation. I make a distinction between two types of policy interventions and the most appropriate research strategy for each. The first are transformative policies like stabilizing monetary policy or moving people from poor to rich countries, which are difficult to do, but where the gains are massive. Here case studies, theoretical introspection, and before-after comparisons will yield “good enough” results. In contrast, there are many policy issues where the choice is far from obvious, and where, even after having experienced the policy, countries or individuals may not know if it has worked. I argue that this second type of policy decision is abundant, and randomized experiments help us to learn from large samples what cannot be simply learnt by doing.

Reasonable people would agree that the question should drive the choice of method, subject to the constraint that we all stay committed to the important lessons of the credibility revolution.

Beyond the questions about inference, we should also endeavor to address the power imbalances that are part of how we conduct research in low-income states. We want to always increase the likelihood that we are asking the most important questions in the contexts where we work, and that our findings will be legible to policymakers. Investing in knowing our contexts and the societies we study (and taking people in those societies seriously) is a crucial part of reducing the probability that our research comes off as well-identified instances of navel-gazing.

Finally, what is good for reviewers is seldom useful for policymakers. We could all benefit from a bit more honesty about this fact. Incentives matter.

Read all the excellent submissions to the symposium here.

More on the apparently *transient* effects of unconditional cash transfers

Berk Ozler over at Development Impact has a follow-up post on GiveDirectly’s three-year impacts. The post looks at multiple papers analyzing results from the same cash transfer RCT in southwestern Kenya:

First, on the initial studies:

On October 31, 2015, after the release of the HS (16) working paper in 2013, but before the eventual journal publication of HS (16), Haushofer, Reisinger, and Shapiro released a working paper titled “Your Gain is My Pain.” In it, they find large negative spillovers on life satisfaction (a component of the psychological wellbeing index reported in HS 16) and smaller, but statistically significant, negative spillovers on assets and consumption. The negative spillover effects on life satisfaction, at -0.33 SD and larger than the average benefit on beneficiaries, imply a net decrease in life satisfaction in treated villages. Furthermore, the treatment (ITT) effects are consistent with HS (16), but the spillover effects are not. For example, the spillover effect on the psychological wellbeing index in Table III of HS (16) is approximately +0.1, while Table 1 in HRS (15) implies an average spillover effect of about -0.175 (my calculations: -0.05 * (354/100)). There appear to be similar discrepancies on the spillovers implied for assets and consumption in the HRS (15) paper and HS (16). I am not sure what to make of this, as HRS (15) is an unpublished paper – there must [be] a good explanation that I am missing. Regardless, however, these findings of negative spillovers foreshadow the three-year findings in HS (18), which I discuss next.
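A quick check of that back-of-envelope calculation (my own sketch; the post does not spell out the units of the -0.05 coefficient or of the 354 figure, so treat both scalings as assumptions):

```python
# Check of the back-of-envelope calculation quoted above.
# Assumption (mine): the -0.05 coefficient in HRS (15) is scaled per 100
# units of exposure and 354 is the relevant average exposure in the sample;
# the post does not spell out the units of either number.
coef = -0.05
exposure = 354
implied_spillover = coef * (exposure / 100)
print(implied_spillover)  # -0.177, close to the "about -0.175" quoted above
```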

Then on the three-year findings:

As I discussed earlier this week, HS (18) find that if they define ITT=T-S, virtually all the effects they found at the 9-month follow-up are still there. However, if ITT is defined in the more standard manner of being across villages, i.e., ITT=T-C, then there is only an effect on assets and nothing else.

… As you can see, things have now changed: there are spillover effects, so the condition for ITT=T-S being unbiased no longer holds. This is not a condition that you establish once in an earlier follow-up and stick with: it has to hold at every follow-up. Otherwise, you need to use the unbiased estimator defined across villages, ITT=T-C.

To nitpick with the authors here, I don’t buy that [….] lower power is responsible for the finding of no significant treatment effects across villages. Sure, as in HS (16), the standard errors are somewhat larger for across-village estimates than for the corresponding within-village estimates. But the big difference between the short- and the longer-term impacts is the gap between the respective point estimates in HS (18), whereas they were very stable (due to no/small spillovers) in HS (16). Compare Table 5 in HS (18) with Appendix Table 38 and you will see. The treatment effects disappeared mainly because the differences between T and C are now much smaller than they were at the nine-month follow-up, and in some cases even negative.

And then this:

If we’re trying to say something about treatment effects, which is what the GiveDirectly blog seems to be trying to do, we already have the estimates we want – unbiased and with decent power: ITT=T-C. HS (18) already established a proper counterfactual in C, so just use that. Doesn’t matter if there are spillovers or not: there are no treatment effects to see here, other than the sole one on assets. Spillover estimation is just playing defense here – a smoke screen for the reader who doesn’t have the time to assess the veracity of the claims about sustained effects.
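To make the distinction between the two estimators concrete, here is a minimal simulation sketch (my own illustration; the effect sizes are made up and are not from HS (16), HS (18), or HRS (15)). With negative spillovers on untreated neighbors, the within-village comparison ITT=T-S overstates the treatment effect, while the across-village comparison ITT=T-C recovers it:

```python
import numpy as np

# Minimal sketch of the two ITT estimators discussed above.
# T: treated households in treated villages
# S: untreated ("spillover") households in treated villages
# C: households in pure control villages
# Effect sizes below are invented for illustration only.

rng = np.random.default_rng(0)
n = 10_000

direct_effect = 0.25      # assumed direct effect on treated households
spillover_effect = -0.20  # assumed negative spillover on untreated neighbors

T = rng.normal(direct_effect, 1, n)     # outcomes for treated households
S = rng.normal(spillover_effect, 1, n)  # outcomes for spillover households
C = rng.normal(0.0, 1, n)               # outcomes for pure controls

itt_within = T.mean() - S.mean()  # ITT = T - S, the within-village estimator
itt_across = T.mean() - C.mean()  # ITT = T - C, the across-village estimator

print(f"ITT = T - S: {itt_within:.2f}")  # ~0.45: inflated by negative spillovers
print(f"ITT = T - C: {itt_across:.2f}")  # ~0.25: recovers the direct effect
```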

Chris has a Twitter thread on the same questions.

Bottom line: we need more research on UCTs, which GiveDirectly is already doing with a (hopefully) better-implemented, genuinely long-term study.

This seems really cool

An African youth volunteer program was just launched in West Africa. The BBC reports that “The scheme would see youths spend time helping out in areas such as agriculture, health or education in a different country to their own.”

Read more here.