In the discovery phase of scientific research, AI systems can be used to review patents and identify potential new medicines for clinical trials. If an organisation develops its own AI system internally without using patient data, the AI Act should not apply, as the system would fall under the research exemption. Similarly, the GDPR is not relevant in such cases, as no personal data is processed.
However, when patient data is used to train an AI system, the GDPR becomes applicable. Patient data can be sourced from various places, including healthcare records collected by institutions such as the NHS, voluntary registries where patients consent to sharing their data for research, or data obtained from previous clinical trials. Because these datasets were originally collected for specific purposes, an assessment is needed to determine whether they can be repurposed for training AI models.
Assessing compatibility for secondary use of patient data
The GDPR includes a purpose limitation principle, which mandates that personal data must be collected for a specific, explicit, and legitimate purpose and not further processed in a manner that is incompatible with that purpose. If the data is to be reused for a different purpose, a compatibility assessment is required. Several factors influence this assessment:
- Link between the initial and secondary purposes: the closer the connection, the more likely it is to be deemed compatible.
- The context and the reasonable expectations of data subjects: if data subjects were informed at the time of collection about potential secondary uses, reuse is more likely to be justified.
- Nature of the data: the more sensitive the personal data (such as health data), the narrower the scope for compatibility will be.
- Consequences for data subjects: both positive and negative consequences must be evaluated.
- Existence of appropriate safeguards: measures like encryption, pseudonymisation, transparency, and opt-out options should be considered.
The purpose limitation principle aims to maintain individuals' control over their data and prevent unauthorised repurposing. If secondary use aligns with reasonable expectations, it is more likely to be deemed compatible. Notably, scientific research is generally considered a compatible secondary use, provided that appropriate safeguards are in place.
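As an illustration only, the five factors above could be captured in a simple record so that the assessment is documented and gaps are visible. The class name, field names and the `unaddressed_factors` helper below are hypothetical and not drawn from the GDPR or any guidance; the sketch records notes against each factor and deliberately does not attempt to score or decide compatibility.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CompatibilityRecord:
    """Hypothetical record of a purpose-compatibility assessment.

    Each field holds free-text notes on one of the five factors
    listed above; an empty value means the factor is not yet addressed.
    """
    link_between_purposes: str = ""
    context_and_expectations: str = ""
    nature_of_data: str = ""
    consequences_for_data_subjects: str = ""
    safeguards: list[str] = field(default_factory=list)

    def unaddressed_factors(self) -> list[str]:
        """Return the names of factors that have no notes recorded."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example: a partially completed assessment (illustrative values only).
record = CompatibilityRecord(
    link_between_purposes="Registry data collected for research into the same disease",
    nature_of_data="Health data (special category)",
    safeguards=["pseudonymisation", "encryption"],
)
print(record.unaddressed_factors())
```

A record of this kind only flags which factors still need attention; the judgement on compatibility remains a case-by-case legal assessment.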
Scientific research and GDPR compliance
The GDPR does not explicitly define 'scientific research', but Recital 159 suggests a broad interpretation, covering technological development, fundamental and applied research, and privately or publicly funded studies. The European Data Protection Board (EDPB) advises that scientific research must adhere to established ethical and methodological standards. Both the discovery phase and clinical research typically follow strict methods or protocols, so they should qualify as scientific research.
Although the EDPB had planned to issue guidance in 2021 on the definition of scientific research and on the appropriate safeguards to be adopted, this has not yet materialised. Given this uncertainty, it is advisable not to assume automatic compatibility but to conduct a thorough compatibility assessment.
Compatibility assessment outcomes
If the secondary use is incompatible with the initial collection, the data cannot be re-used for the secondary purpose unless such processing can be based on the consent of the data subject or on a Union or Member State law safeguarding important objectives of general public interest (eg public health).
If the secondary use is compatible with the initial collection, the controller can rely on the legal basis used for the initial collection of the data. However, all other data protection principles must still be respected for the further processing, including informing data subjects about the further processing and their rights, and performing a data protection impact assessment if required. The compatibility assessment and the measures adopted must be documented to meet the accountability principle.
Exceptions under Article 9 of the GDPR for processing health data
Even when a secondary use is deemed compatible, an additional exception is required under Article 9 of the GDPR for processing health data. Potential exceptions include:
- Explicit consent (art. 9(2)(a)).
- Reasons of public interest in the area of public health based on Union or Member State law (art. 9(2)(i)).
- Scientific research based on Union or Member State law and adoption of appropriate safeguards (art. 9(2)(j)).
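The two-step logic described in the last two sections can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the function name, parameter names and the enum are hypothetical, the sketch covers only the three Article 9 exceptions listed above, and it leaves out the other GDPR obligations (information to data subjects, impact assessment, documentation) that apply in any case.

```python
from enum import Enum
from typing import Optional


class Article9Exception(Enum):
    """The three potential exceptions listed above (not exhaustive)."""
    EXPLICIT_CONSENT = "art. 9(2)(a)"
    PUBLIC_HEALTH = "art. 9(2)(i)"
    SCIENTIFIC_RESEARCH = "art. 9(2)(j)"


def may_reuse_health_data(
    compatible: bool,
    has_consent: bool,
    has_enabling_law: bool,
    article9_exception: Optional[Article9Exception],
) -> bool:
    """Hypothetical two-step check for secondary use of health data."""
    # Step 1: purpose limitation. A compatible secondary use can rely on
    # the original legal basis; an incompatible one needs consent or a
    # Union or Member State law safeguarding public interest objectives.
    purpose_ok = compatible or has_consent or has_enabling_law

    # Step 2: health data additionally requires an Article 9 exception,
    # even when the secondary use is compatible.
    return purpose_ok and article9_exception is not None


# Compatible research use with the research exception available.
print(may_reuse_health_data(True, False, False, Article9Exception.SCIENTIFIC_RESEARCH))

# Compatible use, but no Article 9 exception identified.
print(may_reuse_health_data(True, False, False, None))
```

The sketch treats each input as an already-established legal conclusion; reaching those conclusions is the substantive work and cannot be reduced to boolean flags.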
Conclusion
Using patient data for the development or training of AI systems used for scientific research requires careful regulatory consideration. When patient data is involved, GDPR compliance becomes crucial, and organisations must assess the compatibility of secondary data use in addition to the other GDPR principles and obligations.