Yes, You Can Generalize from Experiments!

August 31, 2017

One of the main criticisms of experimental evaluations — where research units, such as people, schools, classrooms, and neighborhoods, are randomly assigned to a program or to a control group — is that they are conducted under such special circumstances that their results may not apply to other people, places, or times. In evaluation parlance, experiments are criticized for having limited “external validity.”
This “problem” with external validity has two main features:

  • Experiments often have a hard time recruiting sites to participate, so the sites that do take part differ systematically from those that do not; and
  • Individuals who participate in experimental evaluations do not look like the broader population(s) of interest. This is generally a consequence of having selected non-representative sites.

These issues can be addressed through either design or analysis: researchers can design and implement experiments so that they take place in representative settings, or they can adjust an experiment’s results so that they reflect a broader population of interest.

Improving External Validity through Design

Experiments have been successfully conducted in nationally representative sets of sites. Perhaps the two most notable of these are the National Job Corps Study and the Head Start Impact Study. Of course, a random selection of sites is the most straightforward way to ensure representativeness and the generalizability of an evaluation’s results. But there are other ways to ensure that the sites selected to be part of an experimental evaluation represent the sampling frame of interest.
Recent research provides more flexible alternatives to taking a random sample of sites — alternatives that allow for administrative realities while still ensuring the generalizability of study findings. Beth Tipton’s proposed approach involves grouping like sites, ranking them, and deliberately choosing them to mirror the population of interest. Under this approach, a site that declines to participate in the study can be replaced with the next site in the ranking, preserving representativeness and making the approach practical for evaluators.
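The grouping-and-ranking idea can be sketched in a few lines of code. The sketch below is only illustrative — the site names, strata, characteristics, and the single ranking variable are all hypothetical, and Tipton's actual method uses richer covariate information — but it shows the core mechanics: within each stratum, rank sites by how closely they resemble the population target, and replace a refusal with the next site in the ranking.

```python
def select_sites(sites, targets, n_per_stratum):
    """Pick the willing sites that best mirror each stratum's population target.

    sites: list of dicts with keys 'name', 'stratum', 'char' (a site
           characteristic), and 'agrees' (willing to participate?).
    targets: dict mapping stratum -> population-average characteristic.
    Within each stratum, sites are ranked by |char - target|; if a
    top-ranked site declines, the next site in the ranking replaces it.
    """
    by_stratum = {}
    for s in sites:
        by_stratum.setdefault(s["stratum"], []).append(s)

    selected = []
    for stratum, members in by_stratum.items():
        # Rank by closeness to the stratum's population target
        ranked = sorted(members, key=lambda m: abs(m["char"] - targets[stratum]))
        # Refusals simply drop out of the ranking; the next site steps in
        willing = [m["name"] for m in ranked if m["agrees"]]
        selected.extend(willing[:n_per_stratum])
    return selected


# Hypothetical example: site B is the best match in the urban stratum
# but declines, so the next-ranked site A replaces it.
sites = [
    {"name": "A", "stratum": "urban", "char": 0.52, "agrees": True},
    {"name": "B", "stratum": "urban", "char": 0.50, "agrees": False},
    {"name": "C", "stratum": "urban", "char": 0.40, "agrees": True},
    {"name": "D", "stratum": "rural", "char": 0.30, "agrees": True},
]
print(select_sites(sites, {"urban": 0.50, "rural": 0.25}, 1))  # → ['A', 'D']
```

The key design point is that replacement stays inside the stratum and follows the ranking, so a refusal degrades representativeness as little as possible.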

Improving External Validity through Analysis

Established recommendations exist for adjusting the results from a non-representative experimental evaluation so that they can be generalized to a population of interest. Generally, these involve comparing the characteristics of the study sample with those of the target population and then reweighting the experimental results so that they reflect that population.
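A minimal post-stratification sketch makes the logic concrete. All numbers below are hypothetical: suppose an experiment over-represents one subgroup relative to the target population, and the program's impact differs by subgroup. Weighting each subgroup's impact by its population share, rather than its sample share, yields a generalized estimate.

```python
# Hypothetical subgroup impacts and shares (illustration only)
effects = {"young": 4.0, "old": 1.0}        # estimated impact by subgroup
sample_share = {"young": 0.7, "old": 0.3}   # young are over-represented in the study
pop_share = {"young": 0.4, "old": 0.6}      # shares in the population of interest

# Unadjusted estimate: subgroup impacts weighted by their sample shares
naive = sum(effects[g] * sample_share[g] for g in effects)        # 3.1

# Post-stratified estimate: the same impacts reweighted to population shares
generalized = sum(effects[g] * pop_share[g] for g in effects)     # 2.2

print(naive, generalized)
```

Because the over-represented subgroup happens to benefit more, the unadjusted estimate overstates the impact the broader population would experience; reweighting corrects for that compositional difference (though only along the characteristics used to form the strata).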

What’s Next for the External Validity of Experiments?

I am excited about the next wave of experimental evaluations. Previous blog posts have discussed other ways in which experiments are increasingly providing better information about program effectiveness to practitioners and policymakers.
I am confident that the improvements to experimental designs in practice will help future evaluations provide more generalizable results. And, when implementing these design advancements is not possible, then analytic advancements can still help researchers, program administrators and policymakers extend the results from experiments to other people, places, and times.
