In early October, we had the pleasure of participating in a meeting
sponsored by the U.S. Department of Health and Human Services entitled “The Promises and Challenges of Administrative Data in Social Policy Research.” Consistent with the title of the meeting, the presentations emphasized both the promise of using administrative data for policy analysis and the real challenges of doing so: getting access to the data, understanding what the data mean, and verifying that they are sufficient for the intended purpose.
Two Groups of Presentations
We were struck by how the presentations fit into two distinct groups. In one group, presenters described how they combined administrative data with strong research designs for causal inference (e.g., formal randomized controlled trials, leveraging of existing lotteries, difference-in-differences) to address specific research questions that had been posed independently of the administrative data system. These presentations demonstrated the promise of pairing appropriate administrative data with strong causal designs to answer well-posed policy questions.
In the second group, presenters described their efforts to create new databases combining various administrative data systems and to ease access for researchers to those data. Generally, these efforts were not driven by a particular research question. Instead, the implicit position of the presenters was that having richer and more accessible data would allow researchers to answer a multitude of questions not yet formulated. Although presenters sometimes articulated modest descriptive goals as important uses of these data, they mostly expressed the value of the new administrative data systems in terms of quickly answering questions about program impacts: “Does this program or policy work, and, if so, how well?”
‘Data’ and ‘Design’
From our experience, this second approach raises substantial risks of providing unreliable answers to policymakers’ questions about program impacts. Although these databases contain a large amount of previously unavailable information, the following line of argument suggests that, in the absence of an appropriate design, policy advice based on them is likely to be badly flawed.
Outcomes for the program participants, as recorded in these databases, are not enough. (Here we use “program” and “participants” generically to represent the variety of interventions and units that are the focus of public policy research.) Impact is the difference between participants’ outcomes with the program and what their outcomes would have been in the absence of the program, i.e., the “counterfactual.” Administrative data measure the outcomes with the program; we still need to approximate outcomes under the counterfactual. Perhaps participant outcomes without the program would have been the same — or better!
We can try to approximate the counterfactual by measuring outcomes of individuals before and after they participated in the program, but is this a good approximation of the counterfactual? Maybe yes, but probably not. What else might have changed over time that could have influenced the outcomes of the program participants? Would we have expected the passage of time, even in the absence of the program, to have improved outcomes? For example, were participants selected because something was wrong that (at least to some extent) might have gotten better on its own?
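The pitfall in pre/post comparisons can be made concrete with a small simulation. All numbers below are hypothetical, and the earnings model is our own illustration, not drawn from any actual program: people enroll after an unusually bad year, so even a program with zero true effect appears to produce large gains when we compare outcomes before and after participation.

```python
import random

random.seed(0)

def simulate_pre_post(n=100_000, true_effect=0.0):
    """Average pre-to-post earnings change among self-selected enrollees."""
    gains = []
    for _ in range(n):
        baseline = random.gauss(30_000, 5_000)    # person's stable earnings level
        pre = baseline + random.gauss(0, 5_000)   # observed pre-program year
        if pre < 25_000:                          # enroll only after a bad year
            post = baseline + random.gauss(0, 5_000) + true_effect
            gains.append(post - pre)
    return sum(gains) / len(gains)

avg_gain = simulate_pre_post(true_effect=0.0)  # the program does nothing
print(f"Apparent pre/post 'gain' from a zero-effect program: {avg_gain:,.0f}")
```

Because enrollment follows a transitory dip, earnings rebound toward each person's baseline regardless of the program, and the pre/post comparison reports a large positive "impact" where none exists.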
Often we can try to approximate the counterfactual condition using outcomes for those not in the program, but this approach by itself is also unconvincing. Are the outcomes for those not in the program a good approximation of the outcomes that would have been attained among those in the program if the program had not been available? Maybe yes, but probably not. Is there a reason why the non-participants were not in the program? Are they different systematically from those in the program (but in a way that cannot be measured in the administrative data)? Were they ineligible? Did they know something that made them believe that they were less likely to benefit from the program? Any of these considerations would imply that their outcomes are probably different from those of program participants if the program never existed.
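A similar sketch, again with made-up numbers, shows how an unobserved trait can contaminate a participant/nonparticipant comparison. Here "motivation" stands in for any unmeasured characteristic that raises both enrollment and earnings; the naive comparison then attributes the motivation gap to a program that does nothing.

```python
import math
import random

random.seed(1)

def naive_comparison(n=100_000, true_effect=0.0):
    """Participant-minus-nonparticipant difference in mean outcomes."""
    treated, untreated = [], []
    for _ in range(n):
        motivation = random.gauss(0, 1)  # unobserved; absent from admin records
        enrolls = random.random() < 1 / (1 + math.exp(-motivation))
        outcome = (30_000 + 4_000 * motivation      # motivation raises earnings
                   + (true_effect if enrolls else 0.0)
                   + random.gauss(0, 3_000))
        (treated if enrolls else untreated).append(outcome)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

gap = naive_comparison(true_effect=0.0)  # the program does nothing
print(f"Naive participant/nonparticipant gap: {gap:,.0f}")
```

The gap is large and positive even though the true effect is zero, because the two groups differed before the program ever ran.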
We can augment these two approaches (pre/post and participant/nonparticipant) by trying to adjust for these differences. However, such adjustment requires that the administrative data contain information that accounts for why some people were selected into the program and others were not. Such information is seldom recorded in administrative data, and often it is not observed at all. Only rarely will these efforts yield a proper counterfactual, and without a proper counterfactual, we cannot correctly estimate impact.
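What statistical adjustment cannot fix, a strong design can overcome directly. In this last hypothetical sketch, assignment to the program is a coin flip, so an unmeasured trait such as motivation affects treated and control groups equally, and the simple difference in mean outcomes recovers the true impact without any adjustment at all.

```python
import random

random.seed(2)

def randomized_comparison(n=100_000, true_effect=2_000):
    """Treated-minus-control difference in mean outcomes under randomization."""
    treated, control = [], []
    for _ in range(n):
        motivation = random.gauss(0, 1)   # still unmeasured, but now balanced
        assigned = random.random() < 0.5  # coin-flip assignment: the design
        outcome = (30_000 + 4_000 * motivation
                   + (true_effect if assigned else 0.0)
                   + random.gauss(0, 3_000))
        (treated if assigned else control).append(outcome)
    return sum(treated) / len(treated) - sum(control) / len(control)

est = randomized_comparison(true_effect=2_000)
print(f"Randomized estimate of a 2,000 true effect: {est:,.0f}")
```

Randomization does not remove the unobserved trait; it balances it across groups, which is why the estimate lands near the true effect rather than near the biased gaps that weak comparisons produce.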
Administrative Data Systems Must Be Paired with Strong Study Designs
Robust administrative data systems can serve many valid purposes. Such systems can provide rich descriptive analyses more quickly and easily than ever before. They can answer many types of questions about the experiences of program participants. And conceivably, they may also be used to produce reliable estimates of impact and thus lead to better policy. However, reliable estimates of impact will result only when robust administrative data systems are combined with strong designs. Combining strong administrative data with a weak design will yield incorrect estimates of impact and bad policy. Any move toward using administrative data to estimate impact and inform policy must carefully consider design.