The policy research world has generally relied on traditional forms of analysis to pore over numerical data (spreadsheets) or text data (journal articles). We run descriptive statistics or regressions on numerical data, and we read and hand-code text data. Beyond the policy research world, however, others use exponentially larger and more complex data sources and have developed and implemented innovative techniques, including artificial intelligence (AI), to help analyze “big” data. Now it’s time for the policy world to take advantage of these analytical advances.
There are two ways for us to incorporate advances in AI right now, even if our data are not necessarily large. One is the application of natural language processing, a form of AI that enables rapid, comprehensive analysis of text to identify its main themes and categorize items according to those themes. We can now push the boundaries of analysis beyond traditional research materials to sources of massive amounts of text, such as entire websites and large pools of social media posts.
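To make the idea concrete, here is a minimal sketch of theme-based categorization. The theme seed terms and the sample post are hypothetical, and a production system would learn themes from the data itself (for example, with topic modeling) rather than hard-coding them; this only illustrates the categorization step.

```python
from collections import Counter
import math

def tokenize(text):
    return [w.strip(".,!?").lower() for w in text.split()]

def cosine(a, b):
    # cosine similarity between two term-count vectors
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

# Hypothetical theme vocabularies -- in practice these would be learned
themes = {
    "housing": Counter(tokenize("rent housing eviction landlord shelter")),
    "health":  Counter(tokenize("clinic medicare hospital patient insurance")),
}

def categorize(post):
    counts = Counter(tokenize(post))
    return max(themes, key=lambda t: cosine(counts, themes[t]))

print(categorize("The clinic turned the patient away despite insurance."))
# -> health
```

Each post is scored against every theme and assigned to the closest one; the same loop scales from a handful of documents to millions of social media posts.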
The other application is predictive analytics, in particular applied to administrative data collected through state or national organizations and networks. Many of our current data sources are already in formats that would enable the prediction of important individual-level outcomes, even years after we first collect baseline data. For example, Abt has developed algorithms to predict college graduation within four years of entry, and this methodology can help predict employment outcomes as well. Other work focuses on using AI and 300 beneficiary characteristics to predict survival among Medicare beneficiaries at the end of life.
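The core of such a predictive model can be sketched in a few lines. The features (first-term GPA and credits attempted) and the records below are invented for illustration, and this is not Abt's actual algorithm; it simply shows how baseline data can be used to train a model that outputs a probability for a later outcome.

```python
import math

# Hypothetical baseline records: [high-school GPA, credits attempted in term 1]
# paired with the outcome: graduated within four years (1) or not (0).
data = [
    ([3.8, 15], 1), ([3.5, 14], 1), ([3.9, 16], 1), ([3.4, 15], 1),
    ([2.1,  9], 0), ([2.4, 10], 0), ([1.9,  8], 0), ([2.6, 11], 0),
]

def predict(w, b, x):
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 / (1 + math.exp(-z))        # logistic function -> probability

def train(data, lr=0.05, epochs=2000):
    # plain stochastic gradient descent on the log-loss
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y
            b -= lr * err
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, b

w, b = train(data)
print(predict(w, b, [3.7, 15]))  # high predicted graduation probability
print(predict(w, b, [2.0,  9]))  # low predicted graduation probability
```

Once trained on historical records, the same `predict` call scores every new student at entry, years before the outcome is observed.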
Our industry’s culture will have no choice but to adapt to AI as our data sources become larger and richer. That means continuing to train algorithms for both text and numerical data sources so that they become more reliable and viable for implementation.
What will the future look like? Within two or three years, images and videos, with their enormous amounts of data, will begin to play a larger role in our industry. Abt is beginning to use computer vision, another form of AI, with smartphone images to identify and count the eggs of malaria-carrying mosquitoes. This can significantly reduce the time that field workers currently devote to identifying eggs by hand. We can use geospatial data to help predict the onset of natural disasters. That’s akin to using images of skin moles and lungs to predict cancer, something the medical industry is already beginning to do with great accuracy. In our policy research world, we can begin to imagine incorporating other visual sources of data such as road and traffic maps into research on how individuals receive medical services or how they use transportation to go to work and perform activities of daily living.
If other industries are any indicator of how to use data for analysis, our policy research industry should prepare for a world of large and rich data sources from what we currently consider non-traditional areas. Data from the internet or gathered via images and videos may become regular sources for research and applications for our clients. I doubt we’ll completely abandon our traditional analytical tools. But with data easier and cheaper to collect and analyze, we can combine, say, AI and regression analysis and get the best of both worlds. The result: better data and better decision-making.