The concept of personal data, its parameters and how to anonymise data are perennial topics for data protection practitioners and have resonance across all sectors. While the definition of personal data has been around since at least 1995 (the year of the EU Data Protection Directive), there are still debates about the breadth of its scope.
The ICO has historically taken a fairly middle-of-the-road approach, recognising the interpretation of the European Court of Justice in its decisions while also providing guidance which reflects a more nuanced approach.
Back in May 2021, the Information Commissioner's Office (ICO) produced draft guidance on anonymisation, pseudonymisation and Privacy Enhancing Technologies (PETs). On 28 March 2025, the ICO published its guidance on anonymisation and pseudonymisation, although it dropped the reference to PETs from the title. Unsurprisingly, there's a lot of similarity between the March 2025 publication, the earlier consultation draft from May 2021, and the later updated version from 2022.
The key point to underline from this recent guidance (which also appeared in the draft version) is that the ICO accepts that the status of information may change – so that information can be personal data in 'the hands' of one organisation, but anonymous data in 'the hands' of a second organisation, where that second organisation has no access to relevant information, and no means reasonably likely to be used to obtain such information, to identify individuals.
Structure of the guidance
The March guidance provides an introduction to anonymisation, guidance on how to ensure anonymisation is effective, information on pseudonymisation, and the accountability and governance measures to be implemented when producing and disclosing anonymous data. It also provides two case studies at the end – one concerning recruitment analytics and one concerning customer insights for a retailer – to show how pseudonymisation and anonymisation can be demonstrated in practice.
The ICO acknowledges that there are other ways to implement anonymisation techniques, but says it will analyse anonymisation issues in light of its guidance. So, if you are claiming data is anonymous (including where you're using data to train AI models) and you are in discussions with the ICO, note that it will refer to this guidance.
Below we highlight core points under the different sections of the guidance.
Anonymous data
To understand the parameters of anonymous information, you need to understand what personal data means. A recital to the UK GDPR (Recital 26) states that anonymous data is information which does not relate to an identified or identifiable natural person, or personal data rendered anonymous in such a manner that the individual is not or is no longer identifiable. Consequently, the concept of identifiability is central to the test of anonymisation.
The guidance distinguishes between 'anonymisation', a broad term covering the techniques and approaches used to prevent people being identified, and 'effective anonymisation', meaning the standard that technical and organisational measures must meet for data to pass the legal threshold for anonymisation. But the guidance accepts that effective anonymisation may not always be possible due to the nature of the data, the purpose for which it was collected, used or retained, or the context of the processing. For example, in the case of medical records (as set out in the guidance), even if names and addresses are removed, there may still be sufficient detail about individuals to potentially re-identify them, since there could be details about dates and locations of treatment, types of treatment, approximate ages of individuals and so on.
The guidance confirms that the process of anonymisation itself counts as a processing activity. Therefore, a controller must have a lawful basis, define its purpose and be transparent with individuals. Two main approaches to anonymisation are described – generalisation and randomisation – although a combination of methods will usually be expected.
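The guidance does not set out any code, but the two approaches can be sketched in a few lines of Python. The record fields, age bands and noise level below are illustrative assumptions for this sketch, not anything prescribed by the ICO:

```python
import random

# Illustrative record; all field names and values are invented for this sketch.
record = {"name": "A. Example", "age": 34, "postcode": "SW1A 1AA", "income": 41250}

def generalise(rec):
    """Generalisation: replace precise values with broader categories."""
    out = dict(rec)
    del out["name"]                               # drop the direct identifier
    decade = (rec["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"         # e.g. 34 -> '30-39'
    out["postcode"] = rec["postcode"].split()[0]  # keep the outward code only
    return out

def randomise(rec, noise=0.05):
    """Randomisation: perturb values so they no longer match the original exactly."""
    out = dict(rec)
    del out["name"]
    out["income"] = round(rec["income"] * random.uniform(1 - noise, 1 + noise))
    return out

print(generalise(record))  # {'age': '30-39', 'postcode': 'SW1A', 'income': 41250}
print(randomise(record))   # income perturbed by up to ±5%
```

In practice, as the guidance notes, the two approaches are usually combined – for example, generalising quasi-identifiers while randomising sensitive numeric values.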
Effective anonymisation
In the section on ensuring anonymisation is effective, the ICO discusses identifiability and the spectrum of identifiability. Identifiability is about whether you can distinguish one person from another with a degree of certainty. So the guidance makes the point that a person can be identifiable even if you do not know their name, since information can still relate to, and have an effect on, them.
The key indicators of identifiability are (i) singling out and (ii) linkability. Singling out is when you can pick out the same person across records or isolate the records that relate to a person within a dataset. The question is whether you or someone else can single a person out, and the context will often determine whether that is possible. Linkability is the concept of combining multiple records about the same person or a group of people; it is sometimes known as the mosaic or jigsaw effect. So data could be combined with publicly available data for someone to become identifiable, or complex statistical methods may piece together various bits of information to the same effect. Common techniques to reduce linkability include masking and tokenisation of key variables.
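To illustrate the jigsaw effect, the hypothetical sketch below links a released dataset (with names removed) back to an identity using a second, public dataset that shares quasi-identifiers. All field names and values are invented:

```python
# A hedged illustration of linkability (the 'jigsaw effect'): the released
# dataset contains no names, but its remaining quasi-identifiers act as a
# join key against a publicly available source.
released = [  # 'anonymised' release: no names, but quasi-identifiers remain
    {"age": 34, "postcode": "SW1A", "diagnosis": "asthma"},
]
public = [    # publicly available data (e.g. an electoral-roll-style source)
    {"name": "A. Example", "age": 34, "postcode": "SW1A"},
]

for r in released:
    matches = [p for p in public
               if (p["age"], p["postcode"]) == (r["age"], r["postcode"])]
    if len(matches) == 1:  # a unique match re-identifies the record
        print(f"{matches[0]['name']} -> {r['diagnosis']}")
```

This is why removing direct identifiers alone is rarely enough: as in the medical records example above, the remaining details can still single someone out once joined with other data.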
The ICO acknowledges that identifiability can be highly context specific, and therefore refers to a 'spectrum of identifiability', with personal data at one end, anonymous data at the other, and a blurred band in between. That blurred band is, of course, where most of the discussions take place when parties disagree on whether data is personal data or not. The ICO will expect parties to demonstrate that they have gone through an assessment of the means reasonably likely to be used for identification, which examines:
- whether individuals are easily identifiable from the information using readily available means
- whether there are techniques that enable identification from the information by anyone obtaining access to it
- whether there is additional information that may enable identification
- the extent to which the additional information or techniques are reasonably likely to be used by a particular person to identify people to whom the original information relates.
The ICO emphasises that data protection law does not require an organisation to adopt an approach that takes account of every hypothetical or theoretical chance of identifiability. In other words, you are not required to reduce the identifiability risk to zero. But an organisation is expected to assess objective criteria such as how costly identification would be in human and economic terms, the time required, and how technology may develop over time. Likewise, an organisation should consider whether someone else would be reasonably likely to identify an individual from the data, and use the 'motivated intruder' test to consider whether a determined person with access to resources and expertise could identify individuals from the data.
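The guidance does not mandate any particular metric for this assessment, but one widely used objective check for singling-out risk is k-anonymity: counting the smallest group of records that share the same quasi-identifier values. A minimal sketch, with invented field names and data:

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier
    values. A result of 1 means at least one record can be singled out."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())

# Invented sample data for illustration only.
data = [
    {"age_band": "30-39", "postcode": "SW1A"},
    {"age_band": "30-39", "postcode": "SW1A"},
    {"age_band": "40-49", "postcode": "EC1N"},  # unique combination: k = 1
]
print(smallest_group(data, ["age_band", "postcode"]))  # 1 -> singling out possible
```

A minimum group size of 1 means someone can be singled out; larger minimum group sizes make singling out harder, although they do not by themselves guarantee effective anonymisation.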
Having carried out an identifiability risk assessment, an organisation should periodically review it if it is relying on it to argue that data is anonymous. The organisation is expected to monitor technological developments, as well as any datasets that are made publicly available, which could make re-identification more likely.
Pseudonymisation
Pseudonymisation was not defined in UK law prior to the GDPR but now has a specific meaning under data protection law. The guidance describes pseudonymisation as techniques that replace information that directly identifies people, or de-couple that information from the resulting dataset (common types are hashing, encryption and tokenisation). This can mean replacing names or other identifiers with a reference number. Pseudonymising data therefore results in information about people who can't be identified from that information by itself, but who can be identified from additional information held separately. The ICO recognises that there can be confusion where organisations think that they have anonymised data when in fact they have pseudonymised it. This is significant since pseudonymised data is still classified as personal data.
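As a hedged illustration of one of the techniques the guidance names – tokenisation via keyed hashing – the sketch below replaces a name with an HMAC-derived reference token. The key value and record fields are assumptions for this example:

```python
import hmac
import hashlib

# A minimal pseudonymisation sketch using keyed hashing (HMAC-SHA256).
# The key is the 'additional information held separately': anyone holding it
# can regenerate the same token from a name and so re-link the records.
SECRET_KEY = b"replace-with-a-key-kept-separately"  # illustrative placeholder

def pseudonymise(identifier: str) -> str:
    """Derive a stable reference token from a direct identifier."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

record = {"name": "A. Example", "treatment": "physiotherapy"}
record["name"] = pseudonymise(record["name"])  # e.g. a 16-character hex token
print(record)
```

Because the key-holder can re-link tokens to identities, the pseudonymised dataset remains personal data in that organisation's hands – which is precisely the distinction the guidance draws.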
The ICO underlines the distinction by stating that anonymisation is a way of reducing the amount of personal data held, whereas pseudonymisation is a way of reducing the risks associated with the personal data held. The ICO also tackles the concept of de-identified data and, while recognising that the term is widely used, discourages its use since it can lead to confusion. 'De-identified data' is not a term defined under UK law and can have different meanings depending on the context.
Under the guidance, an organisation implementing pseudonymisation is expected to establish what it wants to achieve and how to get there: defining its goals, detailing the risks, deciding which pseudonymisation technique to use and who will carry it out, and documenting the decisions made. Again, the ICO expects an organisation to take into account insider and external threats to pseudonymisation, plus the likely goals of any attack.
Accountability and governance for anonymisation
The guidance expects organisations to carry out a Data Protection Impact Assessment to help them structure and document their decision-making processes around anonymisation and to identify risks to individuals' rights and freedoms, plus mitigation strategies. In particular, if an organisation intends to disclose anonymous data, it is expected to address the practical issues surrounding the production and disclosure of information in its approach to governance. For instance:
- Will it use a trusted third party to help with the anonymisation process?
- What safeguards has it implemented to ensure identifiability is unlikely?
- Who is the senior person within the organisation who oversees the anonymisation process?
The guidance flags that different types of disclosure (open release v limited access) of anonymous data pose different risks. For instance, limited access is more appropriate when handling anonymous data derived from sensitive material.
The guidance sets out a number of safeguards that the ICO expects organisations to consider putting in place before making anonymous information available. These include:
- training personnel who will access the data
- security checks on those who access the data
- limiting data to a particular project
- prohibiting attempts at re-identification
- arranging for destruction or return of data once a project is complete
The guidance also recognises that in certain instances anonymisation can be ineffective due to technological developments that undermine the anonymisation techniques implemented, or where there are other sources of data of which the organisation was unaware. The ICO expects organisations to be aware of such risks relating to the use of anonymous information in their approach to governance. There is also a requirement to explain the approach to anonymisation to individuals in the organisation's privacy notice, including any consequences it may have and what safeguards are in place to minimise the risks.
Part of the toolbox
The guidance includes a plethora of links to other resources that organisations can consult when considering their approach to anonymisation and pseudonymisation, including those provided by ENISA (the EU Agency for Cybersecurity), NIST (the US National Institute of Standards and Technology), ISO (the International Organization for Standardization) and UKAN (the UK Anonymisation Network). While it does not contain anything startling, it is helpful to have final confirmation of the ICO's approach, particularly around changes to the status of information and their consequences.