The ultimate goal of policy analysis is to identify programs that work. Policymakers need to know: Does this program work? For whom does this program work? When does this program work? And would some variant of this program work better? To answer these questions, we need estimates of program impact; i.e., outcomes with the program relative to what outcomes would have been without the program. The “gold standard” approach to estimating impact is random assignment, but other methods are often appropriate (see here for a primer on random assignment).
Figure 1 depicts the implied evaluation process—a program idea is created, the idea gets implemented as a program, and the process leaps forward to a rigorous impact evaluation. Noticeably absent from this evaluation process are formative and process evaluations. Formative evaluations attempt to improve how programs operate. Process evaluations attempt to describe how programs operate (i.e., what they do) and may also document short-term outcomes without explicitly considering what would have happened in the absence of the program.
But if the key policy questions concern impact, should we bother with formative and process evaluations at all? By themselves, they tell us nothing about impact. And if we should do them, why?
The Role of Formative and Process Evaluations
As to the first question—Should we do them?—definitely yes!
As to the second question—Why?—well, that’s more subtle.
All programs have logic models—sometimes explicit, often implicit. A program’s logic model explains how the program is expected to work. The early stages of the logic model—inputs, outputs, sometimes short-term outcomes—have implications for understanding how the program operates and what happens to participants during and shortly after the program. Crucially, checking these implications does not require approximating what would have happened without the program.
A formative evaluation helps a program tweak its operations and strengthen the links in its logic model. For example, if a program is having trouble acquiring or retaining staff, space, and partnerships, a formative evaluation might surface these challenges and help improve implementation. If a program has acquired staff, space, and partnerships, but teachers are not teaching with fidelity or students are not attending (enough of) the classes, a formative evaluation might help the program tweak its implementation to achieve those goals.
Finally, a process evaluation can and should verify that each of these early steps is satisfied. The program’s logic model posits that these early steps are necessary for impacts. If the program cannot show that these early steps are occurring successfully for those in the program, then, according to the program’s own logic model, the program is unlikely to show impacts. Under those conditions, there is little reason to go on to an impact evaluation—i.e., comparing outcomes for those in the program to an approximation of what outcomes would have been without the program. Without satisfying the early steps, longer-term impacts are unlikely to occur.
Process Evaluation as a Tollgate
Viewing the process evaluation as a tollgate suggests the evaluation process depicted in Figure 2. Specifically, rather than proceeding directly from program implementation to an impact evaluation, follow this order: (i) pilot the program; (ii) conduct a formative evaluation to improve operations and refine the program’s own logic model; and (iii) conduct a process evaluation to verify that the early stages of the logic model are satisfied. Only proceed to the expensive, long-timeline impact evaluation (with a control/comparison group) if the relatively inexpensive, short-timeline process evaluation tollgate (without a control/comparison group) is passed.
This process evaluation tollgate serves two functions. First, the formative evaluation improves the program. Initial program implementations are rarely perfect. Programs learn—and improve—by doing. A formative evaluation can speed up that process: while an initial implementation might not pass the tollgate, a refined implementation might.
Second, the process evaluation screens out programs that—according to their own logic models—are unlikely to show positive impacts. Impact evaluation is expensive and has long timelines. We should focus the limited resources available for impact evaluation on the programs most likely to succeed—which excludes those that cannot pass the process evaluation tollgate.
Finally and crucially, note that passing the process evaluation tollgate is a necessary, but not sufficient, condition for impacts. Some perfectly implemented programs with positive outputs and short-term outcomes show no impact on the long-term outcomes we truly care about. Thus, formative and process evaluations are not substitutes for impact evaluation. From a process evaluation, we can often conclude that a program is unlikely to have impact, but we can never conclude that a program does have impact. Concluding that a program has impact requires an impact evaluation.
For more on these ideas see:
- Epstein, Diana, and Jacob A. Klerman. 2012. “When is a Program Ready for Rigorous Impact Evaluation?” Evaluation Review 36(5): 373–399.
- Abt Associates Policy Brief: When is a Social Program Ready for Rigorous Impact Evaluation?
- Epstein, Diana, and Jacob A. Klerman. Forthcoming. “On the ‘When’ of Social Experiments: The Tension Between Program Refinement and Abandonment” in the “Social Experiments in Practice” issue of New Directions for Evaluation.