Meta, the operator of Facebook and Instagram, has been using public user data to train its own artificial intelligence (AI) models since 27 May 2025. The NRW Consumer Rights Organisation (VZ NRW) raised data protection concerns and therefore sought a temporary injunction to prohibit Meta from using the user data for AI training.
The Higher Regional Court (OLG) of Cologne dismissed the application for the injunction on 23 May 2025 (TW reported). The reasons for the decision have now been published, and they offer important guidance for all companies that want to train an AI with user or customer data.
Legal background
Anyone who processes personal data, such as posts on Facebook, requires a legal basis under the GDPR. Training an AI also constitutes such data processing. Meta bases its AI training on its "legitimate interest", namely its economic interest in developing and marketing a new AI product, more precisely a conversational assistant.
The VZ NRW objected to this assessment. It took the view that the legitimate interest was not sufficient to justify the data processing here. It further argued that Meta also processed so-called health data during the AI training, which is not permitted without express consent. Finally, the consumer advice centre argued that Meta violated the Digital Markets Act by unlawfully combining user data.
The OLG rejected the concerns of the consumer advice centre and upheld Meta's legal assessment.
What companies need to know now
The OLG's decision offers opportunities for companies that want to train AI models with user data, and not just for social networks: it strengthens their legal position when doing so.
In particular, the court has now established that AI training with user data can constitute a legitimate interest of companies. The court also refers to the AI Act, the central regulation of AI in the EU, which itself acknowledges that AI models need to be trained on large amounts of data.
At the same time, the judgement does not grant a "free pass" for AI training. Rather, the OLG emphasises that companies must formulate their legitimate interest "sufficiently clearly and precisely" and that this interest must be real. Furthermore, companies must demonstrate that processing the personal data is necessary to pursue that interest: there must be no less intrusive, equally suitable alternative to the use of user data. Meta, for example, was able to credibly demonstrate that it had examined alternatives, such as the use of synthetic data, but found that they were not equally suitable.
In addition, the legitimate interest must be weighed against the interests of the data subjects. Meta took various measures here to protect the data subjects. For example, Meta de-identified the data, i.e. removed direct identifiers such as names or telephone numbers from the data, and granted the data subjects an effective right of objection before the start of the AI training. Furthermore, Meta only uses public user data, i.e. data that anyone could access and use anyway; the consequences would therefore be less severe if the AI later reproduced this data.
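To illustrate what such de-identification can look like in practice, here is a minimal sketch in Python. Meta's actual pipeline is not public, so the field names, patterns, and placeholder strategy below are assumptions for illustration only, not a description of Meta's method.

```python
# Illustrative sketch of de-identification before AI training.
# NOT Meta's actual pipeline (which is not public); the schema and
# patterns below are assumptions chosen for demonstration purposes.
import re

# Direct identifier fields to drop entirely from each record (assumed schema).
DIRECT_IDENTIFIER_FIELDS = {"user_id", "name", "phone", "email"}

# Simple patterns for identifiers embedded in free text (illustrative, not exhaustive).
PHONE_RE = re.compile(r"\+?\d[\d\s/-]{6,}\d")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scrub_text(text: str) -> str:
    """Replace phone numbers and email addresses in free text with placeholders."""
    text = PHONE_RE.sub("[PHONE]", text)
    text = EMAIL_RE.sub("[EMAIL]", text)
    return text

def deidentify(record: dict) -> dict:
    """Drop direct identifier fields and scrub identifiers from remaining text fields."""
    return {
        key: scrub_text(value) if isinstance(value, str) else value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIER_FIELDS
    }

post = {
    "user_id": "12345",
    "name": "Jane Doe",
    "text": "Call me at +49 170 1234567 or write to jane@example.com!",
}
print(deidentify(post))
# {'text': 'Call me at [PHONE] or write to [EMAIL]!'}
```

Pattern-based scrubbing of this kind catches only the most obvious identifiers; de-identification pipelines typically combine it with further techniques and review steps.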
Another key point for the OLG was that users could expect data entered from 26 June 2024 onwards to be used for AI training, because on that date Facebook informed users of its intention to use the user data for AI training. If the use of data for AI training is already being considered at the time of data collection, companies should therefore inform users of this explicitly. A general reference in the privacy policy can help, but may not be sufficient, because such references usually go unnoticed.

Ultimately, however, the required degree of transparency also depends on what the AI is being trained for. Meta, for example, is planning to train an independent AI model that will be marketed independently of Facebook and Instagram. The OLG suggests that the transparency requirements may be lower if the training serves to improve the product for which the user data is collected, because in that case users are more likely to expect the AI training anyway.
Conclusion
The decision of the Higher Regional Court of Cologne shows that it is possible to use user data for AI training without obtaining the users' consent. However, it does not grant a "free pass": the requirements of the GDPR must be carefully examined on a case-by-case basis.
Anyone planning to use customer or user data for AI training should therefore seek legal advice at an early stage.