Late last week, two colleagues at the Urban Institute, Will Schupmann and Matthew Eldridge, published a blog post for Urban Wire with a simple title and a complex subject: ‘When does a social program need an impact evaluation?’ Schupmann and Eldridge argued convincingly that social programs may benefit from impact evaluations even when their benefits appear “so intuitive and … so compelling” that sophisticated impact evaluations seem rather unnecessary.
I agree. But because Schupmann and Eldridge’s post does not directly address the (excellent) question they pose in their title, and instead elaborates on one specific instance in which an impact evaluation might be appropriate for a social program, I thought I would take this opportunity to chime in on the discussion and directly answer the question they asked.
The answer is: A social program needs an impact evaluation when the program is ready for one.
On the face of it, that seems fairly straightforward. It would of course be nonsensical to suggest that a social program needs an impact evaluation when it is not ready for an evaluation—and yet we have seen, over and over again, social programs all too ready to dive into the deep waters of evaluation without being entirely sure yet of whether or not they can swim. And of course one null or negative result from a rigorous, sophisticated impact evaluation can set a program back years or even decades with funders, policymakers, and the general public. And so, to repeat: A social program needs an impact evaluation when the program is ready for one.
Look Before You Leap
My Abt colleague Jacob Klerman has written extensively (here, and also with Diana Epstein in Laura Peck’s recent ‘Social Experiments in Practice’) about the critical role formative and process evaluations can play in moving the research ‘tollgate’ forward, allowing programs to strengthen logic models and effectively implement all program elements before finally proceeding to the crucial—and expensive!—step of conducting an impact evaluation. In Jacob’s words, proceeding in this manner “screens out programs that—according to their own logic model—are unlikely to show impacts” and saves impact evaluations only for programs which have demonstrated, through formative and process evaluations, that they are not unlikely to have an impact.
Jumping to an impact evaluation too early can lead programs that might otherwise have been found successful to question their practice (or lose funding, in the wake of null or negative results) when all they needed to show a positive impact was a little more time to strengthen and clarify program elements before the impact evaluation. Impact evaluations should examine programs as they are, of course, not as they dream of being. But dreams take time to achieve, and asking how a program measures up to its highest aspirations when it’s only just started down the road to getting there doesn’t help anyone.
And so here we are, back at the central point: a social program needs an impact evaluation when the program is ready for one. Readiness is, of course, a necessary but not sufficient condition. Some social programs (like seatbelt laws intended to prevent motor vehicle fatalities, or public health restrictions on tobacco use) are so clearly tied to positive outcomes that we probably don’t need an impact evaluation to establish causality. There is, of course, some point in studying such programs to find out whether different classes of participants benefit to different degrees. This was a point Schupmann and Eldridge made in their piece. But the fact is some things are just plain so, and don’t need further testing.
But some programs do. And if they’re truly ready for an impact evaluation (Laura has written a much longer checklist for such programs here, and discussed the subject at length in her recent book, cited above) then you get to the fun part, at least for evaluation nerds like us: deciding how to design the evaluation, choosing what outcomes to test, running power calculations, and making all the other key decisions that make this line of work interesting and exciting. And from that point on it’s off to the races, as we evaluators try to, in Jacob’s words, identify the programs that work.
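For readers who have not run a power calculation before, here is a rough sketch of what one can look like. It is an illustration only, not the method used in any particular evaluation: it assumes a simple two-arm randomized design with equal-sized groups, a continuous outcome, and the standard normal approximation for a two-sided test. The function name `required_n_per_arm` and its defaults (5 percent significance, 80 percent power) are my own choices for the example.

```python
import math
from statistics import NormalDist


def required_n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-arm comparison of means.

    effect_size is the expected impact in standard-deviation units
    (a standardized mean difference). Uses the normal approximation
    n = 2 * (z_{1 - alpha/2} + z_{power})^2 / effect_size^2,
    which assumes equal group sizes and a two-sided test.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)  # round up to whole participants


if __name__ == "__main__":
    # Smaller expected impacts require much larger samples.
    for d in (0.2, 0.5):
        print(f"effect size {d}: about {required_n_per_arm(d)} per arm")
```

The practical lesson ties back to readiness: under these assumptions, a program expecting a small impact of 0.2 standard deviations needs several hundred participants per arm, so a program that has not yet stabilized its implementation or enrollment may simply be unable to support a well-powered evaluation.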