Harder, better, faster, stronger - will AI improve public policymaking?
How do we find a middle path between the idealised potential and the messy reality of government using data and AI to shape public policymaking?
As ever when discussing AI, we face a rhetorical divide.
We are witnessing increasing excitement about AI’s role in ‘revolutionising’ the way public policy is made. Earlier this year, the then Deputy Prime Minister Oliver Dowden described AI as a ‘silver bullet’ which ‘dangles before us the prospect of increased productivity, vast efficiency savings, and improved services.’ Our recent research with public sector leaders on the deployment of foundation models found some optimism for AI’s potential to mobilise evidence, automate processes, and undertake complex analyses in support of policymaking.
Such excitement stands in stark contrast to how existing data-driven insights are used in policymaking. The Public Administration and Constitutional Affairs Committee pointed out that existing government data is left withering in silos, and that when announcing major policy, the Government often ‘chooses to do so either without any underpinning analyses, or with reference to numbers which bear no resemblance to published evidence.’
We need to carve out a middle path between the idealised potential and the messy reality of government using data and AI to shape public policymaking. And we need to start by unpicking what we mean by AI.
Definitions
There is no single, agreed-upon definition of ‘AI’. AI can refer to a scientific field of study, a series of products and services, or features of a product or service. Broadly, definitions of AI describe the science of creating computer systems capable of performing tasks that traditionally require human intelligence. This can include teaching machines to draw inferences from data (machine learning) or hard-coding rules that categorise data (rules-based systems). Some AI systems use a relatively small amount of data to address a single narrow task.
But the kind of AI capturing recent attention is different. Generative AI tools, which use large amounts of data to generate text, audio, or visual outputs from a user’s input, can perform a wider range of tasks. They use probability calculations to create plausible content – mimicking human responses – but don’t understand the material they put out, meaning they can make unpredictable errors that are convincing but wrong.
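To make that concrete, here is a deliberately toy Python sketch of what ‘probability calculations’ means in practice. It is invented for illustration only: real generative models learn their statistics from vast datasets and billions of parameters rather than a hand-written table. Text is produced by repeatedly sampling a likely next word, with no understanding of what is being said.

```python
import random

# Toy 'language model': for each word, the probabilities of possible next words.
# These numbers are invented; real models learn such patterns from huge datasets.
next_word_probs = {
    "the":      {"policy": 0.5, "minister": 0.3, "data": 0.2},
    "policy":   {"will": 0.6, "was": 0.4},
    "minister": {"announced": 0.7, "said": 0.3},
    "data":     {"shows": 0.5, "suggests": 0.5},
    "will":     {"improve": 0.5, "reduce": 0.5},
}

def generate(start: str, max_words: int = 5) -> str:
    """Produce plausible-sounding text by sampling one likely word at a time."""
    words = [start]
    for _ in range(max_words):
        options = next_word_probs.get(words[-1])
        if not options:               # no known continuation: stop generating
            break
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the policy will improve" - fluent, but nothing is 'understood'
```

The output can read convincingly precisely because it mirrors patterns in the data, which is also why errors arrive with the same fluency as facts.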
Despite generating new content, these tools are still built with historic and accessible data.
Data deficiencies
Any AI tool is only as good as its data, and novel forms of AI are still haunted by old data problems.
Datasets are partial; public service datasets reflect public service users and processes. Those who face barriers to accessing services may be absent or under-represented in the data, as may those who turn to private sector support instead. Because of this, datasets are never a true reflection of reality: the data picture can tell distorted stories of public needs, drivers and outcomes.
Even our best datasets are not perfectly curated. During the COVID-19 pandemic, when NHS patient data (considered a dataset ‘crown jewel’ for research and innovation) was analysed to understand risk factors, ethnicity data was found to be missing from over a quarter of GP records.
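As a purely illustrative sketch of the kind of completeness audit that surfaces such gaps (the records and field names below are invented, not NHS data), a few lines of Python are enough to show how much of a dataset is simply missing, and therefore what any tool built on it will never see:

```python
import pandas as pd

# Hypothetical patient records; None marks a field that was never recorded.
records = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "age":        [34, 67, 51, 45],
    "ethnicity":  ["White British", None, None, "Asian British"],
})

# Share of missing values per column: any analysis or AI tool built on this
# data silently inherits these gaps and the biases that come with them.
missing_share = records.isna().mean()
print(missing_share["ethnicity"])   # 0.5 - half of these toy records lack ethnicity data
```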
With generative AI, there are concerns about data provenance. OpenAI’s GPT-4 (which powers ChatGPT) draws on datasets of social media activity, published books and various other content sources, many of which have been scraped without the rightsholders’ permission. Datasets used by these tools are so large that, according to companies like OpenAI or Google, it can be impossible to determine precisely what data is within them. One major dataset used to train some of the most cutting-edge image-generation tools was found to contain child sexual abuse material and misogynistic and racist content.
AI privileges data that can be quantified and statistically analysed. Even where the underlying information is accurate, legitimate and relevant, turning it into data flattens richness and lived experience. We should be cautious about treating datasets as proxies for truth and reality, as there are gaps between people’s lived experiences and the data that has been collected. A care organisation collecting data on the number of visits to a vulnerable patient may have little understanding of the dignity, respect and safety felt by that patient.
We risk over-privileging the data-driven methodologies used in AI tools because they are quick and powerful, and overlooking the qualitative or experiential insights they do not meaningfully capture.
How might AI improve public policymaking?
Discussion of AI improving the public sector usually refers to two things:
- AI as ‘better than’: improving the status quo (customising, personalising, triaging, or enabling breakthroughs).
- AI as ‘quicker than’: rendering a process or service more efficient, leading to productivity gains.
AI as ‘better than’
Data science and advanced analytics have enhanced the public sector’s ability to sift, match and link vast quantities of data. Without such techniques, developing new datasets and gathering insights from everyday human data by hand would be an insurmountable task.
Where AI is used to build evidence (analysing trends, modelling, forecasting), if accurate, it could better inform policymaking by offering up insight. We shouldn’t, however, assume that a lack of evidence is the only reason for poor policymaking, or even that there is an agreed approach to what ‘good’ policymaking looks like.
Where AI is used more instrumentally, complementing existing tasks or performing them ‘better than’ a human, there are only a small number of narrow use cases with evidence that AI outperforms manual approaches (such as detecting anomalies in radiography scans).
When it comes to using AI to undertake, contribute to or augment the complex, contested or broader tasks associated with public policymaking, we have very little evidence. There is no systematic understanding of where AI tools are being used across the public sector (even within central Government), let alone whether they actually improve outcomes.
AI as ‘quicker than’
The successful deployment of AI to smooth out unproductive time and reduce laborious work could make the public sector more efficient. Given the UK’s stubbornly low productivity figures, which have been worrying economists for at least two decades, this would be a big win.
In 2023, the Central Digital and Data Office (CDDO) estimated that almost a third of tasks in the civil service could be automated, although accurate estimates are notoriously difficult to make about a new technology. Current ideas being explored by Government include using AI to draft public communications, offer advice via chatbots, create visualisations or explainers, produce briefing notes, and analyse public consultations. The latter is being piloted by the Incubator for AI, which claims that automation could save most of the £80 million currently spent on consultation analysis.
Replacing existing consultation analysis with this Consult tool would save thousands of hours of work. However, this approach privileges quantitative over qualitative analysis, undermining the potential of finding a ‘golden nugget’ in the data: a compelling quote or story that shifts thinking. It would also erode the experience of engaging with consultation material, dulling the cumulative impact of reading people’s stories in their own voices. Sometimes ‘friction’ in a system can produce something positive, intimate, or tactile which gets erased through the pursuit of technological efficiency.
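To make the quantitative-versus-qualitative trade-off concrete, here is a deliberately crude sketch (not how Consult or any real tool works) of consultation analysis reduced to theme counting. Every response, including the one story that might change a policymaker’s mind, becomes a single tick in a box:

```python
from collections import Counter

# Invented consultation responses, for illustration only.
responses = [
    "The closure would turn my 20-minute trip to dialysis into a 2-hour journey.",
    "I support the plan, it seems sensible.",
    "I support the plan.",
    "I oppose the plan entirely.",
]

# Crude keyword-to-theme mapping.
themes = {
    "support": ["support"],
    "oppose":  ["oppose"],
    "access":  ["journey", "trip", "travel"],
}

def count_themes(texts: list[str]) -> Counter:
    """Tally which themes appear across responses: fast and scalable, but the
    'golden nugget' story carries no more weight than any other row."""
    tally = Counter()
    for text in texts:
        for theme, keywords in themes.items():
            if any(keyword in text.lower() for keyword in keywords):
                tally[theme] += 1
    return tally

print(count_themes(responses))
# e.g. Counter({'support': 2, 'access': 1, 'oppose': 1}) - the dialysis story is just one 'access' tick
```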
Of course, it’s idealised to imagine that those designing policy have free rein to be moved by individual responses in consultations. The reality is often a regimented coding process. But automating direct engagement with the material takes away one important aspect of a consultation: offering publics a chance to be listened to.
No one would argue that consultation creates a perfect human relationship with a government decision-maker. It is, in part, an imperfect attempt at something relational. We should consider how automating such processes may change the public’s willingness to engage, and how they choose to do so, if they know their responses will be analysed by a machine. We need to account for the value of existing human-to-human relationships.
Productivity is also not an uncontested ‘good’ in and of itself. While a chatbot guiding someone through a bureaucratic labyrinth may be a form of efficiency that benefits all parties, other technologies could drive productivity in ways that are intrusive, unacceptable or inhumane, such as pervasive workplace surveillance.
The productivity puzzle
A more fundamental issue (sometimes glossed over by those enthusiastic about headline productivity gains) is that inserting technology into any system changes that system. Technologies don’t simply switch a manual task to an automated one, boosting productivity per civil servant and saving costs. Instead, they create ripple effects, altering people’s behaviours and expectations around each technology. Professionals may defer to a tool, ignore it completely, or learn how to ‘game’ it to get the desired results. The public may lose trust in a system, triggering harms or pushing greater demand onto other parts of the public sector. AI may well increase productivity, but we must consider its insertion as part of a three-part puzzle comprising technology, processes and people, not as an isolated quick fix.
However, this sociotechnical analysis of AI isn’t always built into predictions of potential productivity gains. The CDDO analysis identifying major AI productivity gains did not examine the feasibility of delivering them or assess the costs of doing so. The Tony Blair Institute’s recent pitch that embracing AI could deliver £40 billion in public sector productivity savings was critiqued for using ChatGPT to categorise existing public sector tasks, predict whether they could be performed by AI, and calculate the cost savings on a like-for-like basis.
Despite the sense that generative AI ought to boost productivity, the supporting evidence isn’t all there. And some are sounding the alarm on whether the hype will really deliver. As Goldman Sachs’ Head of Global Equity Research points out, ‘eighteen months after the introduction of generative AI to the world, not one truly transformative—let alone cost-effective—application has been found.’ Indeed, a recent survey of workers and C-suite executives found that 77% say generative AI has added to their workload, increasing pressure to work longer hours to be more productive.
When and how can AI really make a difference?
Can AI improve public policymaking? It’s a classic answer to any essay question to say ‘possibly, sometimes, for some things’, but that’s probably where we are.
It’s overly simplistic to think that more AI will easily equate to more efficient decision making, especially in the complex system that is policymaking. Despite the hype, we simply don’t have the evidence to determine whether AI can improve productivity in practice, or whether it’s better or faster than humans at the sort of tasks associated with policymaking. We do, however, have evidence that successfully adopting technologies requires hard work, time and costs beyond the sticker price.
We need to get more granular about what AI we are talking about, for what purpose, and with what data, to understand where these technologies could support policy. And to be effective, we need to consider these tools as sociotechnical, understanding how societal systems will shift with their deployment. That requires a deeper understanding of the technology in the context of these complex systems (which include real people, who don’t behave as predictable, rational actors).
Greater evidence should allow those making decisions about technology in the public sector to cut through the marketing promises and begin to treat AI like other policymaking methodologies or interventions. Where AI does successfully support public sector decision making, we must still carve out resources to complement that data-driven insight with other forms of qualitative, relational and experiential understanding from both publics and professionals.
Acknowledgements
Thanks to Yasmin Ibison, Andrew Strait, Gaia Marcus, Joe Westby and Matt Davies for thoughtful comments and contributions.
About the author
Imogen Parker is the Associate Director for Social Policy at the Ada Lovelace Institute.