4 November 2024
AIQ - Autumn – 4 of 7 Insights
In a decision of 27 September 2024, the Hamburg Regional Court dismissed a photographer's lawsuit against LAION, the provider of the LAION-5B image-text dataset. The decision rests mainly on the copyright exception for text and data mining (TDM) for purposes of scientific research, but it also addresses a number of other issues, such as the applicability of the TDM exceptions to the training of generative artificial intelligence, the requirements for declaring a reservation of rights under the general TDM exception, and the conditions of “machine readability”. The decision is not yet binding and may still be appealed in the coming weeks.
LAION offers the LAION-5B image-text dataset, which can be used to train large image-text models, such as Stable Diffusion. The plaintiff, a stock photographer, claimed that LAION had unlawfully downloaded one of his photographs for the purpose of creating AI training datasets, and sought a cease-and-desist order against the allegedly unlawful download. The dataset contains hyperlinks to publicly accessible images or image files on the internet as well as further information about the corresponding images, including an image description that provides information about the content of the image in text form. The dataset comprises 5.85 billion such image-text pairs. LAION extracted the URLs to the images from this dataset and downloaded the images from their respective storage locations. It then used software to check whether the description of the image content already in the existing dataset actually matched the content to be seen in the image. The website from which the image was downloaded contained terms and conditions that prohibited, among other things, the use of automated programs to access the website or any content on it by way of downloading, indexing, scraping or caching any content on the website.
The Court rejected the plaintiff’s claims, as the use was covered by the copyright exception for text and data mining “for the purposes of scientific research” (Article 3 of the DSM Copyright Directive as implemented in German law). This exception does not allow rightsholders to opt out. The intended use qualified as “text and data mining” as defined by the law (i.e. an automated analytical technique “aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations”). The Court did not see any evidence that LAION cooperated with a (commercial) third-party undertaking that had decisive influence over it and preferential access to the research results, which would have excluded the exception. The Court expressly decided only on the legality of the download, and not on the question of the (subsequent) training of generative AI, which was not part of the claim brought.
Although further reasoning was not strictly necessary, the Court, in an obiter dictum, also gave an initial assessment of the applicability and interpretation of the “general” TDM exception (Article 4 of the DSM Directive as implemented in German law). Here too, the Court accepted that LAION’s use generally qualified as text and data mining. Moreover, the Court tended to the view that the TDM exception covered not only data analysis but, with reference to Article 53(1)(c) AI Act, also the creation of datasets for the subsequent training of generative AI. However, the terms and conditions of the website that distributed the plaintiff’s photographs would likely have contained a valid opt-out. Although the opt-out had not been made by way of a programmed exclusion protocol (such as robots.txt), but in “natural” language, the Court tended to the view that such a reservation was sufficiently explicit and specific. The opt-out could also be declared by a non-exclusive licensee of the rightsholder. In addition, such a reservation likely also satisfied the requirement of “machine readability” for content made available online, as state-of-the-art technologies (as mentioned in Article 53(1)(c) AI Act) were likely available to understand natural language reservations.
The Court’s decision is the first judgement of a court in the EU addressing the interpretation of the TDM exceptions. Although the judgement may still be appealed, and although there is no rule of binding precedent in German law, the decision will very likely be taken into account by other courts in Germany and possibly beyond, as it addresses a number of controversial questions at the intersection of copyright and AI. The scientific community will likely welcome the judgement, as it sheds some light on the scope of the TDM exception for scientific purposes under the DSM Directive. It is also noteworthy that the Court, in its obiter dictum, tended to see the TDM exception as generally broad enough to cover the creation of datasets for the training of generative AI. As regards the general TDM exception, which also covers commercial purposes, the discussion of what qualifies as an expressly stated and “machine readable” opt-out will stay high on the agenda.
by Dr. Christian Frank, Licencié en droit (Paris II / Panthéon-Assas) and Dr. Gregor Schmid, LL.M. (Cambridge)