
Anonymisation and Pseudonymisation

This page provides information and guidance on anonymising and pseudonymising identifiable research data.

Background

It is important to give research participants clear assurances about how their data will be used, including their options for having their name and other personal details held securely within the project and not made public. Anonymisation and pseudonymisation are not the same thing, and the sections below explain the differences.

Anonymity

When data is collected and held anonymously, there are no identifiers that can link the information collected to the participant; not even the researcher could identify a specific participant.  

This can be because you are not asking your participants to give you any information that can identify them – for example, you may not ask for names or contact details (direct identifiers).  However, be aware that the collection of too many demographic variables (or indirect identifiers) could limit the anonymity of participants when combined. For example, knowing that a survey respondent in a staff survey is female, has been at the University for less than 2 years, and is in a particular Department could identify them.  
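One way to check for this risk before sharing a dataset is to count how many respondents share each combination of indirect identifiers. The following is a minimal sketch only: the function name `small_groups`, the field names, the sample responses and the threshold `k` are all illustrative assumptions, not part of any official guidance, and a suitable threshold depends on the study.

```python
from collections import Counter


def small_groups(records, indirect_identifiers, k=5):
    """Return each combination of indirect identifiers shared by fewer than k records.

    `k` is a hypothetical threshold chosen for illustration.
    """
    counts = Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up staff survey responses containing no direct identifiers.
responses = [
    {"gender": "female", "tenure": "<2 years", "department": "History"},
    {"gender": "male", "tenure": "<2 years", "department": "History"},
    {"gender": "male", "tenure": "<2 years", "department": "History"},
    {"gender": "female", "tenure": "5+ years", "department": "Physics"},
    {"gender": "female", "tenure": "5+ years", "department": "Physics"},
]

# Combinations held by a single respondent could single that person out.
risky = small_groups(responses, ["gender", "tenure", "department"], k=2)
print(risky)
```

Any combination this returns is a candidate for being collected in broader bands (for example, wider tenure ranges) or dropped altogether.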

If you are collecting identifiable information, you may be able to anonymise this data by removing personal identifiers (both direct and indirect) after the data have been collected. It is ordinarily best practice to anonymise data if there is no reason to re-identify data subjects.  

Where data have been truly anonymised and individuals are no longer identifiable, the data do not fall within the scope of GDPR. The Information Commissioner's Office (ICO) suggests using the ‘motivated intruder’ test to consider whether data have been fully anonymised. 

In an anonymous study, the researcher needs to indicate how the participants will be kept anonymous. Be certain to include in your information sheet that no participants will be identified, and explain the implications for the participants’ ability to withdraw their data. Because individual contributions cannot be identified, withdrawal of data is not possible.

Pseudonymisation 

Pseudonymisation is where any identifying characteristics of the data are removed, replaced or transformed, and the identifying information is kept separate from the data itself. For example, identifiable information may be replaced with pseudonyms such as numbers, codes or fictitious names, with a key produced and maintained to allow re-identification. The key should be stored securely and separately from the treated dataset.
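The replace-and-keep-a-key approach described above can be sketched as follows. This is an illustrative example under stated assumptions, not a prescribed method: the function `pseudonymise`, the `name` field, the `participant_code` column and the `P-` code format are all hypothetical. In practice the key would be written to secure storage that is separate from the treated dataset.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace a direct identifier with a random code.

    Returns the treated records and the key (identifier -> code).
    The key permits re-identification, so it must be stored securely
    and separately from the treated records.
    """
    key = {}
    treated = []
    for record in records:
        identifier = record[id_field]
        if identifier not in key:
            code = "P-" + secrets.token_hex(4)
            # Regenerate in the unlikely event that a code is already in use.
            while code in key.values():
                code = "P-" + secrets.token_hex(4)
            key[identifier] = code
        row = {k: v for k, v in record.items() if k != id_field}
        row["participant_code"] = key[identifier]
        treated.append(row)
    return treated, key


# Made-up interview records.
interviews = [
    {"name": "Alex Example", "response": "Agree"},
    {"name": "Sam Sample", "response": "Disagree"},
    {"name": "Alex Example", "response": "Neutral"},
]

treated, key = pseudonymise(interviews)
```

Random codes are used here instead of codes derived from the name (such as initials or an unsalted hash), because derived codes can sometimes be reversed without access to the key.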

Data that have been pseudonymised are still considered to be personal data and fall within the scope of data protection legislation. It is important to remember the difference between pseudonymisation and anonymisation. Pseudonymisation is a way of reducing risk and ensuring appropriate data security, but it does not transform personal data to the extent that the relevant legislation no longer applies. In many cases, pseudonymised data can still be re-identified through indirect methods. 

Further information

You can find more information and guidance on anonymisation of personal data:

For more information about pseudonymisation:

Find out more about research ethics and integrity

