An increasing number of court cases in the EU and the UK address the challenging interface between AI and copyright. While the EU and UK still see fewer proceedings than the US, which currently has more than 40 (mostly still pending), numbers are on the rise, and you can track recent developments in the Taylor Wessing AI & Copyright Tracker. Here we provide a brief overview of the most relevant developments, trends, and key legal questions.
The text and data mining exception: the central battleground
A critical issue across nearly all EU AI copyright cases is whether the EU's text and data mining (TDM) exceptions (Articles 3 and 4 of the EU Directive on Copyright in the Digital Single Market (DSM Directive), implemented in national laws) permit the use of copyrighted works to train AI models.
Courts are confronted with whether TDM exceptions extend to the training of generative AI. Early decisions suggest a willingness to interpret these exceptions broadly, with important caveats around opt-out mechanisms:
- GEMA v OpenAI (Germany): on 11 November 2025, the Munich Regional Court ruled that the TDM exception generally covers TDM for the purposes of AI training. However, if an AI model actually reproduces complete training data ('memorisation'), which was seen to be the case here, such use falls outside the exception's protection.
- LAION (Germany): in September 2025, the Hamburg Regional Court ruled that creating datasets for AI training purposes qualifies as text and data mining for scientific research under Section 60d of the German Copyright Act, which implements Article 3 of the DSM Directive, and is therefore permitted without the possibility of opting out. The court indicated that the TDM exception likely covers AI training generally.
- Dutch Media Publishers v Knowledge Exchange B.V. (Howards Home) (Netherlands): the Amsterdam District Court held in 2024 that a news aggregator offering AI-created summaries can rely on the TDM exception and emphasised that opt-out declarations must be specific and properly implemented. The court also denied a violation of the Berne Convention’s “three-step test” as set out in Article 5(5) of the EU InfoSoc Directive.
Opt-out mechanisms and machine-readability
The TDM exception under Article 4 DSM Directive allows rightsholders to opt out of text and data mining, but the practical implementation and legal requirements for valid opt-outs are contested.
Courts are requiring clear, specific, and technically proper opt-out mechanisms. General or ambiguous declarations will likely not be sufficient. However, the exact technical standards are still in dispute. Notably, the following decisions were issued before the release of the General-Purpose AI Code of Practice, which specifically addresses the robots.txt exclusion protocol. Pending cases also discuss whether licensees can validly declare an opt-out:
- LAION (Germany): in 2024, the Hamburg Regional Court, in an obiter dictum, accepted that a machine-readable opt-out can be declared in a website’s terms and conditions using natural language (i.e. plain words). The appellate court, where the case is currently pending, appears to lean towards a stricter view, which may require the use of the robots.txt standard.
- Dutch Media Publishers v Knowledge Exchange B.V. (Howards Home) (Netherlands): the defendant argued that the claimant’s prohibition on automated searches was not machine-readable in the robots.txt format and that it excluded only certain AI bots. The court accepted this argument and indicated that the burden of proving a valid opt-out lies with the rightsholder relying on it.
- Hungarian Search Engine Case (Hungary): in a 2024 ruling, the court found that the plaintiff had not opted out of TDM in the form required by law, allowing the defendant's use to continue under the exception.
'Memorisation'
A recurring question is whether LLMs and generative AI systems actually reproduce their training data or substantial portions of it. While it appears accepted that GPAI models do not, as such, store verbatim copies, courts are grappling with the 'memorisation' of protected works.
- GEMA v OpenAI (Germany): the Munich court ruled that ChatGPT actually “reproduces” copyright-protected song lyrics from its training data. Through training, the lyrics were “memorised” and thus reproduced in the AI model itself, even if this required several intermediate steps and even if the lyrics were only reflected in the model’s parameters. The court inferred such reproduction from the fact that the outputs contained the lyrics. The decision stands in marked contrast to the High Court’s decision in Getty (UK), where the court ruled that the model stores patterns rather than data/works, that the model weights do not directly store the pixel values of the training images, and that there was no evidence of a copyrighted work having been “memorised” by the model.
- Like Company v Google (Hungary/CJEU): this is the first AI copyright case to be referred to the CJEU; the referral was published in 2025. Although mainly focused on the press publisher’s right under Article 15 of the DSM Directive, it also touches on reproduction issues and may have a major impact on the interpretation of Article 4 DSM Directive. In its request, the Hungarian court asks whether the process of training an LLM-based chatbot constitutes an instance of reproduction, where that LLM is built on the observation and matching of patterns, enabling the model to learn to recognise linguistic patterns. If the answer is in the affirmative, the court further asks whether such reproduction of (in that case, lawfully accessible) works falls within the TDM exception provided for in Article 4 DSM Directive.
Training vs. output
Courts and litigants are distinguishing between two different stages of potential copyright infringement: (1) the use of copyrighted works during AI training, and (2) the generation of outputs that may infringe copyright. If the output includes protected content (which may be seen as another instance of 'memorisation'), the question arises as to who is responsible: the actual user prompting the model to generate the output, or the model provider.
- GEMA v OpenAI (Germany): the court took the view that there are several distinct forms of copyright infringement: protected lyrics used as training data were reproduced in the model itself. In addition, the outputs, which contained substantial parts of the training data when prompted by end users, constituted a reproduction as well as a “making available to the public” of the lyrics. Although the outputs were triggered by the end users' prompts, OpenAI was held responsible, as it played a central role in creating the outputs.
- Like Company v Google (Hungary/CJEU): the question of responsibility for a chatbot’s output was also put to the CJEU in the Hungarian court’s request, asking whether the DSM Directive and the InfoSoc Directive should be interpreted as meaning that, where a user gives an LLM-based chatbot an instruction which matches the text contained in a press publication, or which refers to that text, and the chatbot then generates its response based on the instruction given by the user, the fact that, in that response, part or all of the content of a press publication is displayed constitutes an instance of reproduction on the part of the chatbot service provider.
- Getty v Stability AI (UK): in proceedings brought by Getty Images against Stability AI in the UK, Getty abandoned its claims of primary copyright infringement arising from the training of Stability's model and outputs generated by that model. This left Getty's claim of secondary infringement, treating the model weights themselves as "infringing articles"; however, this was unsuccessful. The court held that "Although an "article" may be an intangible object... an AI model such as Stable Diffusion which does not store or reproduce any "Copyright Works" (and never has done so) is not an "infringing copy" such that there is no infringement...". Had Getty been able to show that the training of Stability's model occurred in the UK, the outcome might have been very different. Likewise, had the output claims related to more creative works, and had there been a greater link between those works and the allegedly infringing outputs, there might have been output infringement.
Beyond copyright: personality rights, trade marks, terms and conditions
Rightsholders are also pursuing alternative legal routes to enforce their interests.
- Voice Actor v YouTube Creator (Germany): the Berlin Regional Court held in 2025 that the unlicensed use of an AI-generated voice strikingly similar to an actor's real voice infringes personality rights, and that freedom of expression could not justify commercial use.
- Getty v Stability AI (UK): Getty won on double identity infringement and likelihood of confusion for various historic models and lost on its reputation-based claim. However, the facts were very specific to the case and this does not necessarily mean all trade mark infringement claims in the UK will fail.
- An association of German journalists filed a lawsuit against a newspaper, claiming that specific AI-related clauses in the newspaper’s terms and conditions are invalid. Specifically, the association argues that an obligation on journalists to grant the newspaper the right to use their works for AI training creates an unfair disadvantage.
The evolving legal landscape
It would be both an exaggeration and an understatement to say that the legal landscape of AI and copyright cases is evolving. An exaggeration, as the decisions so far have barely touched the variety and complexity of the legal ramifications; and an understatement, as the number of AI and copyright cases has increased significantly in recent years. One point is certain: the underlying questions will be with us for a while. Stay tuned with our AI & Copyright Tracker.