Is ‘anonymisation’ enough to share NHS patient data safely?


The UK government’s AI action plan aims to use public sector data, including anonymised NHS patient data, to drive innovation while addressing crucial data security and privacy concerns

Large amounts of high-quality data are essential for training algorithms, and the success of AI depends on the strength of its datasets. This is a foundational truth recognised by every programmer and acknowledged by the Government.

Labour’s recently announced AI action plan includes the creation of a National Data Library to help ‘mainline AI in the veins’ of the UK.

This Data Library is set to bring together public sector information, which may include anonymised NHS patient data.

AI is already being applied within healthcare to support drug development, speed up lung cancer diagnoses and improve outcomes for stroke patients. Widening access to health data could accelerate this kind of medical innovation.

The risks of data breaches in healthcare

Healthcare data is incredibly valuable and a prime target for cybercriminals.

Only recently, a major private provider of NHS services was hit by a ransomware gang that held over 2TB of data for a $2 million ransom. It’s the latest in a long line of attacks on healthcare that have caused everything from blood shortages to delays in cancer treatment.

While the Government claims it will ‘responsibly, securely and ethically unlock the value of public sector data’, is anonymisation enough? Or are there additional ways we should be seeking to secure our most sensitive information?

What does anonymised patient data mean?

Anonymising data means removing names, addresses and any other Personally Identifiable Information (PII) from a dataset.

Instead of names, records are often given unique identifiers, a process known as pseudonymisation, so that entries for the same patient can still be linked within the dataset while identities stay hidden. Unlike fully anonymised data, pseudonymised data can be re-identified if necessary by whoever holds the key.

If fully anonymised, the data falls outside the scope of GDPR, making it legally easier to handle.
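To make this concrete, here is a minimal sketch of pseudonymisation in Python. The record fields, the NHS number and the secret key are all hypothetical; the point is that direct identifiers are stripped, while a keyed hash lets whoever holds the key link, and potentially re-identify, records.

```python
# Minimal pseudonymisation sketch. All values are hypothetical.
import hashlib
import hmac

SECRET_KEY = b"held-by-the-data-controller"  # hypothetical key

def pseudonymise(record: dict) -> dict:
    # Derive an opaque, stable ID from the NHS number with a keyed hash,
    # so records for the same patient can still be linked together.
    pseudo_id = hmac.new(SECRET_KEY, record["nhs_number"].encode(),
                         hashlib.sha256).hexdigest()[:12]
    # Keep only non-identifying clinical fields.
    return {
        "patient_id":    pseudo_id,
        "year_of_birth": record["year_of_birth"],
        "diagnosis":     record["diagnosis"],
    }

record = {"name": "Alice Smith", "nhs_number": "943 476 5919",
          "year_of_birth": 1990, "diagnosis": "asthma"}
print(pseudonymise(record))
# -> {'patient_id': '<12 hex chars>', 'year_of_birth': 1990, 'diagnosis': 'asthma'}
```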

The balance between innovation and privacy

There’s a risk of record linkage: anonymised data being cross-referenced against non-anonymised data from other sources. For example, a person’s age, sex and postcode may appear in an anonymised health dataset and also in non-anonymised public records (e.g. social media profiles). If the two datasets are joined on these shared attributes, the ‘anonymised’ information can be de-anonymised.
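Here is a minimal sketch of such a linkage attack in Python using pandas; all records, names and column layouts are fabricated for illustration:

```python
# Toy record-linkage (re-identification) attack. All data is fabricated.
import pandas as pd

# "Anonymised" health dataset: names removed, quasi-identifiers kept.
health = pd.DataFrame({
    "age":       [34, 52, 34],
    "sex":       ["F", "M", "F"],
    "postcode":  ["SW1A 1AA", "M1 1AE", "LS1 4DY"],
    "diagnosis": ["asthma", "diabetes", "hypertension"],
})

# Public, non-anonymised records (e.g. scraped from social media).
public = pd.DataFrame({
    "name":     ["Alice Smith", "Bob Jones"],
    "age":      [34, 52],
    "sex":      ["F", "M"],
    "postcode": ["SW1A 1AA", "M1 1AE"],
})

# Joining on the shared quasi-identifiers re-attaches names to diagnoses.
linked = health.merge(public, on=["age", "sex", "postcode"])
print(linked[["name", "diagnosis"]])
#           name diagnosis
# 0  Alice Smith    asthma
# 1    Bob Jones  diabetes
```

Three attributes are enough in this toy case, and research has repeatedly shown that a handful of quasi-identifiers can single out a large share of individuals in real populations.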

Then there’s the added challenge that statistical health studies need some personal information to produce meaningful results. Removing all PII may ultimately reduce the usefulness of studies if researchers cannot conduct cohort tracking or factor analysis, for example. No wonder Alan Duncan, of the Alan Turing Institute, warned that anonymising health data ‘has to be done very carefully’.

Many members of the public are also wary about sharing health information. Currently, around 6% of NHS patients have chosen to have their health data excluded from research and planning in England.

This figure risks rising if the Government can’t reassure the public that their data is safe.

Adding another layer of protection

With standard anonymisation far from fail-safe, it’s worth ministers exploring additional protections as they refine the details of their AI plan.

One solution is Fully Homomorphic Encryption (FHE), an emerging encryption technology that allows computations to be performed directly on encrypted data, drastically reducing the opportunity for a breach to expose usable information.

Unlike with traditional encryption methods, which require data to be decrypted before it can be used, patients’ electronic health records, genetic data, medical images, lab results and other sensitive data can be processed without ever exposing the raw data to potential attackers.

Ensuring data remains encrypted even during processing reduces the chances of malicious actors, whether insiders or outsiders, accessing or interpreting sensitive information. Beyond that, one of FHE’s primary benefits is secure data sharing.
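To give a flavour of what ‘computing on encrypted data’ means, below is a toy Python sketch using the Paillier cryptosystem. Paillier supports only homomorphic addition, unlike full FHE schemes (implemented in production libraries such as Microsoft SEAL and OpenFHE), and the parameters here are deliberately tiny and insecure, but the core idea is the same: a third party can compute on ciphertexts without ever seeing the plaintexts.

```python
# Toy additively homomorphic encryption (Paillier). Illustration only:
# the primes are far too small for real security.
import math
import secrets

p, q = 1000003, 1000033               # demo primes
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                  # valid because we pick g = n + 1

def encrypt(m: int) -> int:
    r = secrets.randbelow(n - 1) + 1  # random blinding factor
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n * mu) % n

# Encrypt two (hypothetical) blood-test readings.
c1, c2 = encrypt(120), encrypt(95)

# A third party can add them WITHOUT decrypting anything:
# multiplying ciphertexts corresponds to adding plaintexts.
c_sum = (c1 * c2) % n2
assert decrypt(c_sum) == 215
print("decrypted sum:", decrypt(c_sum))
```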

But if FHE is an ideal complement to anonymisation, why has the Government yet to embrace it?

FHE is a newer and more complex technology than anonymisation, and its standardisation is still in progress. However, adoption is increasing, and the tooling is becoming accessible enough for healthcare organisations to deploy without deep cryptographic expertise.

Securing sensitive data for AI progress

As the Government’s new AI plan develops, exploring FHE’s diverse use cases and significant security advantages is crucial.

There’s an opportunity here for the Government to work with stakeholders across the FHE space, from technology providers to industry bodies, to create standardised FHE solutions that protect UK data.

In addition to FHE, the Government should also be levelling up its anonymisation. Techniques such as differential privacy, which conceals any individual’s contribution to a dataset by adding calibrated statistical noise, and k-anonymity, which ensures each record is indistinguishable from at least k−1 others, offer stronger guarantees than simply stripping identifiers.

Both can be used very effectively in conjunction with FHE to train machine learning models on sensitive data.
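As a minimal illustration of differential privacy, here is the Laplace mechanism applied to a hypothetical count query over patient records; the epsilon value and the data are illustrative only.

```python
# Laplace mechanism for a counting query (sensitivity = 1):
# adding or removing one patient changes a count by at most 1.
import random

def dp_count(records, predicate, epsilon: float) -> float:
    true_count = sum(1 for r in records if predicate(r))
    # Difference of two Exp(epsilon) draws is Laplace with scale 1/epsilon.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

patients = [{"diagnosis": "asthma"}, {"diagnosis": "diabetes"},
            {"diagnosis": "asthma"}]
print(dp_count(patients, lambda r: r["diagnosis"] == "asthma", epsilon=0.5))
# e.g. 2.7 (true count is 2; the noise protects any one individual)
```

Smaller epsilon values give stronger privacy at the cost of noisier answers.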

The Government’s ambitious AI plan has huge potential for the UK. However, the data it relies upon must be treated with utmost care. Standard anonymisation isn’t enough to share sensitive public information, and the Government must seek extra ways to secure individuals’ data. Otherwise, opening the door to innovation could unleash more than intended.
