How synthetic data can facilitate AI adoption in government

January 17, 2025

Image: © Parradee Kietsirikul | iStock

Despite challenges in AI adoption within the government, synthetic data has proven to be a game-changing tool that can drive innovation while also safeguarding privacy

It’s no secret that artificial intelligence (AI) can transform the government’s delivery of public services, with the potential to provide ‘billions of pounds in productivity savings’, according to the National Audit Office (NAO).

Offering a wealth of benefits, AI can generate meaningful insights from masses of data, automate routine tasks, and pave the way to more effective customer service, not to mention its unique ability to detect cases of fraud in real time.

Yet recent research has found that, despite the promise of the technology, the slow uptake of AI is hindering the government’s strategic goals. Only one in five government officials believe ‘significant’ progress is being made in key AI-related areas, with generative AI faring particularly poorly (12 per cent).

Outdated IT infrastructure, budget constraints, skills gaps, operational procedures and data issues, particularly around quality and consistency, are all proving key challenges in the adoption of AI.

Access to large, diverse and authentic data is crucial for training robust AI models. Yet getting that kind of real-world data in the public sector can be tough due to strict privacy requirements, legal restrictions, and high data acquisition and annotation costs.

There is cause for optimism, however, as the use of synthetic data offers a way to overcome these hurdles and facilitate the use of AI to positively impact service delivery for citizens across the UK.

What is synthetic data?

Synthetic data is a product of generative AI itself, defined as algorithmically generated data that mimics real-world data.

Personally identifiable information (PII) is closely guarded in the public sector, so where departments can’t risk exposing it they need replacement data to analyse. AI can fulfil this need by generating data that replicates the patterns and signals of the original without its specific characteristics, creating a facsimile rather than a direct copy.

These datasets are highly valuable for testing and training accurate predictive models without the need to obscure sensitive information. This ‘synthetic twin’ approach ensures near-perfect anonymity while helping to mitigate bias.
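To make the idea concrete, here is a minimal sketch in Python. It does not use generative AI and is not how any particular product works. Instead it swaps in a much simpler technique, fitting a two-variable Gaussian and sampling from it, purely to illustrate how a dataset can share the statistical patterns of the original without copying any record. The ‘real’ records (age and annual income) are invented for the example, and every function name is hypothetical.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Summarise two numeric columns by mean, standard deviation and correlation."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(params, n, rng):
    """Draw n new records from the fitted distribution; no original row is reused."""
    rho = params["rho"]
    spread = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mx"] + params["sx"] * z1
        # Mixing z1 into the second column reproduces the fitted correlation.
        y = params["my"] + params["sy"] * (rho * z1 + spread * z2)
        rows.append((x, y))
    return rows


rng = random.Random(42)

# Invented stand-in for sensitive records: (age, annual income).
real = []
for _ in range(500):
    age = rng.uniform(20, 65)
    real.append((age, 15000 + 400 * age + rng.gauss(0, 3000)))

params = fit_gaussian(real)
synthetic = sample_synthetic(params, 5000, rng)
synth_params = fit_gaussian(synthetic)

print("real correlation:     ", round(params["rho"], 3))
print("synthetic correlation:", round(synth_params["rho"], 3))
```

The synthetic rows keep the link between age and income found in the stand-in data, yet none of them corresponds to a real individual. A simple model like this only captures averages, spreads and one linear relationship, which is why real tools rely on far richer generative models.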

There is a wide range of use cases for synthetic data in government departments. For example, transport departments can use synthetic data of simulated traffic flows to test road improvements with what-if scenarios, even if they only have a few months of traffic data. Or, where the aim is to remove the need for PII, welfare agencies could identify the characteristics of population cohorts most in need of new benefits without using real citizen data.

Putting it into practice

Synthetic data allows teams to experiment with use cases – and learn from the findings – to inform where they want to invest in AI. With budget a primary concern for decision makers, this eases the pressure of committing to a large upfront investment before seeing what the results could be.

At the same time, experimenting with synthetic data through guided platforms like SAS Viya Workbench is a clear opportunity to upskill teams and help them build their technical knowledge. It provides a means to learn, test and develop new solutions without requiring access to sensitive data, which has traditionally been a barrier.

Employees working in healthcare could model hospital capacity planning and resource distribution during emergencies using synthetic patient data, while those focused on welfare could simulate the effects of policy changes on the population.

The opportunities are endless and help to improve the speed and accuracy of decision making for meaningful change. Naturally, synthetic data also addresses common data issues by eliminating privacy risks, overcoming data scarcity, and improving upon the errors, inconsistencies or noise found in real-world data.

Looking at the wider picture

The use of synthetic data is just one component in the process of effective AI implementation in government departments. As outlined in SAS’ recent research report Data & AI in the UK government, there are five key steps needed to capitalise on enthusiasm and enable AI adoption in government.

Firstly, departments should plan and allocate resources to the project, accounting for factors like budget and scalability. The next step is inspiration and use case selection, where the team can assess each case’s value against its complexity and work out whether data collection will be straightforward.

Then, departments will want to develop an AI prototype, focused on rapid iteration and built using non-personally identifiable information, non-operational data and possibly synthetic data. Once this has been developed, the implementation step centres on scaling the solution and making continuous improvements.

Finally, the assessment phase evaluates the changed way of working in a department and looks forward to scaling out and maintaining the solution.

Keeping an innovative mindset, and being open to exploring the possibilities of new forms of AI like synthetic data, makes all of this possible and ultimately helps to achieve the best outcomes. While a subsequent study by SAS has found interest in synthetic data is low in government – with 32 per cent of decision makers responding they would not consider it – its relevance for departments means this sentiment could be stifling AI’s potential to transform operations for good.