Part I – Introducing a new copyright exception for TDM
The primary purpose of the consultation is to test the government’s proposal for a new TDM exception, borrowing largely from the EU exception adopted in the 2019 EU DSM Directive and from the transparency provisions enshrined in the EU AI Act, adopted in 2024.
The consultation does not propose a definition of TDM, nor does it explain why TDM is a critical component for the AI industry. It notes only briefly that the use of automated techniques to analyse large amounts of information is often referred to as “data mining”, and that where the process involves a reproduction of a copyright work, permission is needed from its owner under copyright law.
While the consultation acknowledges that “the copyright law in this area is disputed”, it offers regrettably little insight into the nature of the dispute and what arguments are at stake with respect to foundational concepts of copyright and information law. Instead, the consultation makes a number of statements which appear to be designed to sidestep the most difficult questions, including that “the use of copyright works to train AI models has given rise to debate, in the UK and around the world […] on the extent to which copyright law does or should restrict access to media for the purpose of AI training”, that creators expect to be “treated fairly” or that “it is essential that AI supports, and does not undermine, human creativity and the creative industries”.
Against this background, the consultation proceeds to spotlight several key areas of focus:
1. Key features of the proposed exception for TDM
Following in the footsteps of EU law, the government proposes that the new TDM exception should have the following key features:
- It would apply to TDM for any purpose, including commercial use.
- It would apply only where the TDM user has lawful access to the works, enabling right holders to seek remuneration at the point of access (e.g., through subscriptions or licences).
- Right holders could opt out of the exception “through an agreed mechanism” and decide that it does not apply to the copies they hold.
- It would be underpinned by transparency about the sources of training material.
There are no incentives for right holders to refrain from opting out, nor does the consultation discuss the implications of introducing an exception that right holders have the power to render meaningless before it even passes through Parliament.
On a practical note, however, the government’s approach seeks to address some of the uncertainties of the EU opt-out regime. For instance, the government suggests that it may be willing to explore adjustments to the exception to ensure that the effects of a rights reservation apply to a work as a whole rather than to each copy of the work separately, or that right holders may be able to opt out of specific AI uses of their works (e.g., generative AI purposes) without opting out of all TDM, such as indexing or language training.
2. Technical standards for opting out
The consultation acknowledges the potential of technologies to enable right holders to reserve their rights but raises concerns about the current lack of standardisation in this area, as well as the challenges faced by right holders in knowing how to validly reserve rights under the EU DSM Directive. As in the EU text, UK policymakers propose that rights in online works “should be reserved using effective and accessible machine-readable formats, which should be standardised as far as possible”.
The proposal considers several machine-readable methods, noting their potential benefits and shortcomings: applying the robots.txt standard at the site level, expressing an opt-out in metadata associated with individual works, or notifying AI firms directly of the lack of consent with respect to specific works via online registries. The government acknowledges that regulation may also be required to support the adoption of standards and to ensure that the protocols and metadata used to reserve rights are recognised and complied with.
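By way of illustration, a site-level reservation under the robots.txt standard could single out known AI training crawlers. The crawler names below are real user agents published by OpenAI and Google, used here purely as examples; the consultation does not name any crawlers, and any such list would need to track new crawlers as they appear:

```text
# robots.txt served at the root of the site

# Block OpenAI's training crawler from the whole site
User-agent: GPTBot
Disallow: /

# Block Google's AI-training user agent while leaving ordinary
# search indexing (Googlebot) unaffected
User-agent: Google-Extended
Disallow: /
```

At the level of individual works, the draft W3C TDM Reservation Protocol (TDMRep) takes the metadata route contemplated in the consultation, for instance via an HTML tag such as `<meta name="tdm-reservation" content="1">`. None of these mechanisms is currently binding on crawlers, which is precisely the gap the regulation contemplated by the government would address.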
3. Contracts and licensing
For all other works (i.e., those not available online), the text refers only to “agreed mechanisms” and contemplates measures to support good licensing practices between right holders and AI developers, whether dealing directly or through a third party, such as a publisher or collective management organisation to whom they have assigned or licensed their rights. While the consultation does not expressly address how contracts should be used to reserve rights, it focuses on a more recent issue that has arisen in the context of the EU exception: the question of who is entitled to opt out. As highlighted in the consultation, “the party with ultimate control over how and whether rights are reserved will […] often not be the original creator or performer, but the party to whom their works have been licensed”. Unlike in the EU, the consultation seeks to address this topic by proposing that the individual creator or performer – not just the copy holder – should have the opportunity to agree with their licensees whether they permit their work to be used for TDM.
Further, the consultation acknowledges the role that collective licensing and collective management organisations could play in giving access to large volumes of content to AI developers but also that “new structures may be needed to support the aggregation and licensing of data for AI training purposes”. To that end, the government invites views on “the role of collective licensing and aggregation/brokering services in providing access to copyright works and remuneration for right holders”. Consistent with its overall cautious tone, the consultation does not directly invite respondents to discuss the nature of the licensing scheme (compulsory or optional), but question 15 – “Should the government have a role in encouraging collective licensing and/or data aggregation services? If so, what role should it play?” – appears to steer respondents toward considering this possibility.
4. Transparency
Transparency over AI models is a cornerstone of the government’s proposal, and it unsurprisingly borrows some of its key features from the EU AI Act. However, it goes significantly further, as the proposed requirements are not limited to “general purpose AI” and would apply to all AI models, regardless of their size or the computational power required for training them. Suggested measures include:
- Disclosure of specific works and datasets used in AI training
- Disclosure of web crawler details, including ownership and purpose
- Record-keeping requirements to demonstrate compliance with rights reservations
- Requirements to provide certain information on request
- Evidence of compliance with opt-outs
The consultation further acknowledges that transparency may at times present practical or legal challenges, for example when confidential contracts are entered into or when disclosure would compromise trade secrets, and seems to suggest that these risks may be tackled by adopting a “high-level summary” approach, as enshrined in both the EU AI Act and California’s Assembly Bill 2013.
5. “International interoperability” of copyright
In a segment entitled “Treatment of models trained in other jurisdictions”, the consultation raises the hotly debated issue of copyright territoriality. Mirroring concerns raised by the EU, the UK government acknowledges the potential need to “establish a level playing field between providers of models which are trained within the UK, and those trained outside the UK but made available for use in the UK”, explaining that, otherwise, “developers which train their models in the UK [may be] at a disadvantage”.
In contrast to the EU, which mandated the application of EU copyright law to foreign-trained models made available within its jurisdiction, the UK’s proposal is so far limited to “encouraging” AI developers operating in the UK to comply with UK law on AI model training, even if their models are trained in other countries. It suggests that such encouragement should be facilitated by introducing the concept of “internationally interoperable” copyright, to be discussed with the EU and the U.S. and under the auspices of the G7 and G20, which would serve as the primary forums for future copyright discussions.1
6. TDM for research
The consultation closes its segment relating to TDM issues with a short section on TDM for research purposes. It highlights that UK copyright law already provides a specific exception for data mining for non-commercial research in section 29A CDPA (without a right to opt out) and invites views on whether this exception should be further aligned with the EU’s exception on TDM for research purposes established under the EU DSM Directive. It notes in particular that:
- The UK’s exception applies to both research institutions and researchers themselves, unlike the EU’s exception, which applies to organisations and institutions only.
- Unlike the UK’s exception, the EU’s exception also covers commercial research and applies to databases as well as copyrighted works.
Part II – Regulating AI outputs
The second half of the UK government’s consultation shifts its focus to AI outputs. After briefly asserting that the existing copyright framework seems well-suited to address the copyright issues stemming from AI-generated content, it moves on to explore the following related concerns:
1. Protectability of AI-generated content by copyright
The consultation confirms the classic view held among copyright experts and academics that human creators making use of an AI tool are protected “in similar terms” in the UK as they are in the EU and U.S. The consultation does not address the debates that prompted the U.S. Copyright Office to publish its Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence and to treat AI-generated elements of a work similarly to material in the public domain. However, it suggests that a comparable approach might be adopted in the UK when assessing the protectability of a work.
The consultation also notes that certain “entrepreneurial works” – such as sound recordings, films, broadcasts and published editions – are entitled to protection irrespective of the level of human input involved. As a result, the individual or entity that arranged for their production through AI will retain the right to assert copyright ownership over these works.
The section on protectability closes by revisiting the controversial “computer-generated works” provision under section 9(3) of the CDPA. This provision has long been criticised for its inherent contradiction – how can a work without a human author be considered “original”? The government seeks feedback from those who have relied on this provision, hinting that its removal is a possible outcome if the consultation does not reveal sufficient benefits.
2. AI output labelling
The government’s text supports the introduction of AI output labelling obligations, citing the transparency provisions in the EU AI Act as a possible source of inspiration while acknowledging that both quantitative and technical challenges associated with AI output labelling may arise. No further details are offered at this time.
3. Deepfakes, digital replicas and the possibility of introducing “personality rights” in the future
The text defines digital replicas as “images, videos and audio recordings created by digital technology to realistically replicate an individual’s voice or appearance” and emphasises that data protection law or the tort of passing off may serve to protect these attributes. It also notes that some within the creative industries have been advocating since 2022 for the introduction of “personality rights” in the UK to grant individuals greater control over the use of their likeness or voice. The text invites respondents to share their views on this topic, while clarifying that it is not the primary focus of the consultation and that it “would be a significant step” requiring careful consideration.
4. Other emerging issues, including inference and synthetic data
The consultation concludes by addressing two issues described as “emerging”: inference and the use of synthetic data to train AI models.
It defines inference as “the process by which a trained AI system generates outputs using new data” and explains that “AI products can interact with copyright works at inference”, for example if a user includes a copyright work in their prompt to the system or prompts the system to summarise a news publication using retrieval-augmented generation. The text offers no insight into potential issues, but stakeholders are unlikely to have missed the signal that the contentious topic of private copying may be returning to the agenda.
By way of background, an attempt to introduce a private copying exception under UK law was made in 2014, allowing individuals to make personal copies of works to which they had lawful access. However, the exception was quashed by the High Court in 2015 following a successful challenge by right holders. Without a private copying exception, UK law remains more restrictive in this area than that of many EU countries, and personal copying for private use (including for prompting a model) is not generally permitted without the right holder’s explicit consent.
As a final note, the government mentions the increasing use of synthetic data (ineligible for copyright protection) to train AI models and invites respondents to provide comments on “how this may affect the functioning of the licensing ecosystem and the UK copyright framework more broadly”. With the UK lagging more than six years behind the EU in implementing a TDM exception, and with the methods, pace and volume of AI training changing rapidly, the government seems to be inviting comments on whether the proposed framework risks becoming outdated before it even clears Parliament.
1. The current international situation, shaped by, among other things, divergent policies and unequal standards, makes any attempt at updating copyright treaties highly unlikely, which may explain the shift toward the G7 and G20 as alternative forums for addressing these issues.
In-depth 2025-015