
In August 2025, Anthropic did something unprecedented. The AI company agreed to pay $1.5 billion, with preliminary court approval following in September, to settle copyright claims from authors whose books were used to train its Claude chatbot. At roughly $3,000 per book for an estimated 500,000 works, the settlement represents the first major resolution in the escalating legal battle over whether AI companies can use copyrighted material freely for training. The deal fundamentally reshapes the economics of AI development, demonstrating that companies can afford to compensate creators without halting innovation. But this settlement represents just one resolution in a sprawling legal conflict where different courts have reached contradictory conclusions about identical questions. [1]
Critically, the settlement only releases Anthropic from liability for past conduct related to pirated works through to August 2025 and does not create any licensing scheme for future AI training, cover claims based on AI outputs, or affect Anthropic’s ability to train on lawfully acquired materials.
When Judges Disagree: The ‘Fair Use’ Puzzle
The Anthropic settlement emerged from Bartz v. Anthropic, where Judge Alsup had previously ruled in June 2025 that training AI on copyrighted books constitutes fair use – characterising the practice as “quintessentially transformative” and even “spectacularly so.” Just two days later, Judge Chhabria in Kadrey v. Meta reached similar conclusions about Meta’s Llama model, finding the training “highly transformative” despite Meta having sourced books from unauthorised pirate repositories. [2] These judges saw AI training as serving a purpose fundamentally different from that of the original works: building language understanding systems rather than publishing books for readers.
Yet barely four months earlier, on 11 February 2025, U.S. Circuit Judge Stephanos Bibas – sitting by designation in the U.S. District Court for the District of Delaware – ruled the exact opposite in Thomson Reuters v. ROSS Intelligence. [3] He held that ROSS’s AI-powered legal research tool did not qualify for fair use because it served the same commercial purpose as Thomson Reuters’ competing product. Where copying serves to build direct market competitors, Judge Bibas reasoned, it looks like appropriation rather than transformation.
This divergence creates profound uncertainty for anyone in the AI ecosystem. Identical conduct – training models on copyrighted works – has been treated as fair use in some Northern District of California cases yet rejected as unfair in the District of Delaware on a competitor-use fact pattern. Companies attempting legal compliance face contradictory signals depending on jurisdiction, evidentiary record and the perceived proximity of the AI system to the rightsholder’s market. Appeals are likely to shape the contours of a workable standard, but until then, the risks remain highly fact‑dependent.
The contradiction reveals that courts fundamentally disagree about what AI systems do. Judges favouring AI companies emphasise generative capabilities – the systems’ ability to produce novel outputs and assist with unprecedented tasks. Judges ruling against AI companies focus on competitive harm and market substitution – when AI serves the same functions as the works it was trained on, copying appears more like theft than innovation. Both perspectives cite established precedent, both claim fidelity to copyright principles, and both present logical reasoning.
Music’s Unique Challenge: When AI Learns to Sing

While text-based AI battles dominate headlines, music presents distinct challenges because audio carries different cultural and economic significance. Three major record labels – Universal Music Group, Sony Music, and Warner Music Group – filed lawsuits in June 2024, coordinated through the Recording Industry Association of America (‘RIAA’), against AI music generators Suno and Udio. [4] The complaints allege that these services trained on copyrighted sound recordings to build systems that generate remarkably similar music.
The labels demonstrated that careful prompting could coax these systems into producing tracks resembling iconic songs. Suno allegedly generated music containing recognisable elements from Chuck Berry’s “Johnny B. Goode,” B.B. King’s “The Thrill Is Gone,” and James Brown’s “I Got You (I Feel Good).” Udio reportedly created outputs with striking similarities to Michael Jackson’s “Billie Jean,” ABBA’s “Dancing Queen,” and Mariah Carey’s “All I Want For Christmas Is You.” The companies could not have built models producing such similar audio without initially copying those recordings, the labels argue.
The labels allege that both companies trained on vast catalogues of copyrighted recordings without authorisation and that their services can generate outputs emulating protected expression. [4] Both companies have broadly maintained that training is lawful, while disputing the labels’ characterisation of the technology as a substitute for recorded music. The cases have since become focal points for the question of whether music – because it is routinely consumed in fixed, replayable form – will be treated differently from books or images in the generative‑AI fair use analysis.
Music publishers separately pursued Anthropic in Concord Music Group v. Anthropic, filed in October 2023. [5] Eight publishers, including Universal Music and ABKCO, alleged that Anthropic’s Claude could reproduce copyrighted song lyrics when prompted – sometimes entire verses verbatim. Unlike claims centred on training, these allegations focused on output: Claude allegedly generated content that directly reproduced protected works. The case continues, with disputes over both training practices and Claude’s ability to reproduce lyrics.
The litigation landscape transformed dramatically in late 2025. Universal Music Group and Udio announced a settlement and strategic agreement on October 29, 2025, followed by Warner Music Group’s settlement with Udio on November 19, 2025 and its settlement and partnership with Suno on November 25, 2025. [4] Each arrangement signals a pivot from pure litigation toward licensing‑style frameworks. The companies have indicated that new services and models, trained on licensed and authorised content, are expected to launch in 2026, with the current disputed models to be retired or materially re‑worked. The financial terms have largely not been publicly disclosed.
These settlements do not resolve all claims or all claimants. Sony Music’s actions against Suno remain active, and other disputes continue, including class actions brought by independent artists in Illinois in October 2025 alleging “stream‑ripping” and other unlicensed acquisition of copyrighted recordings. [4] In parallel, European collecting societies have also pursued enforcement strategies. GEMA filed proceedings against Suno in the Munich Regional Court in January 2025 and, in November 2025, obtained its landmark ruling against OpenAI concerning the reproduction of song lyrics by ChatGPT. [6] The combined effect is a bifurcated landscape – major labels increasingly negotiating structured deals, while independents and collecting societies continue to litigate – and, for AI music developers, a clear message that “licensed‑only” training is becoming the expected operating model.
Music cases highlight how different creative industries face distinct challenges. Text can be paraphrased and summarised; musical elements like melody, rhythm, and distinctive vocal styling are harder to transform beyond recognition. When AI produces audio that sounds remarkably similar to copyrighted recordings, the copying becomes aurally obvious in ways that text transformation might obscure. These cases will test whether courts apply different fair use standards for different creative media.
Images and the Compressed Copy Theory
Visual artists were among the first to challenge AI training in court. [7] Sarah Andersen, Kelly McKernan, and Karla Ortiz filed Andersen v. Stability AI in January 2023, targeting image generation systems trained on billions of scraped images. Their initial complaint faced scepticism from Judge William Orrick, who dismissed significant portions in October 2023, questioning whether they could prove infringement when AI outputs don’t reproduce specific training images exactly.
Everything changed in August 2024 when the artists refined their theory. They introduced the “compressed copy” concept: AI model parameters don’t merely learn abstract statistical patterns but rather store encoded representations of training data. When models reliably generate images matching specific artists’ styles, the argument runs, those model weights must contain transformed versions of copyrighted works. Judge Orrick found this theory plausible enough to proceed toward a September 2026 trial, creating potential liability at every stage: training models, distributing them, and using them.
The compressed copy theory may represent a fundamental challenge to AI companies’ technical self-understanding. Engineers describe training as statistical pattern extraction without storing retrievable copies – more like learning correlations than recording data. But if courts accept that model parameters themselves constitute copies, the entire foundation of modern machine learning faces legal jeopardy. The theory’s implications extend beyond images: if image models contain compressed copies, what about language models trained on text?
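To make the dispute concrete, the sketch below shows one way an expert might probe a model for verbatim memorisation: feed it the opening of a known passage and measure how much of the true continuation it reproduces under greedy decoding. This is an illustrative outline only – it assumes the Hugging Face transformers library and uses the publicly available GPT-2 model purely as a stand-in; the function name and overlap metric are invented here and are far cruder than the extraction techniques used in research or litigation.

```python
# Illustrative memorisation probe (assumes `pip install transformers torch`).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder model, not any party's production system

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def memorisation_probe(prefix: str, reference_continuation: str,
                       max_new_tokens: int = 50) -> float:
    """Generate the model's most likely continuation of `prefix` and measure
    how far it matches the known continuation word-for-word."""
    inputs = tokenizer(prefix, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,                      # greedy decoding
        pad_token_id=tokenizer.eos_token_id,  # silence GPT-2's missing-pad warning
    )
    generated = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:])
    gen_words = generated.split()
    ref_words = reference_continuation.split()
    matches = 0
    for g, r in zip(gen_words, ref_words):    # length of the matching word prefix
        if g != r:
            break
        matches += 1
    return matches / max(len(ref_words), 1)

# A consistently high overlap across many excerpts would support the claim that
# the weights encode retrievable expression; consistently low overlap would
# support the "statistical patterns only" account.
score = memorisation_probe(
    "It was the best of times, it was the worst of times,",
    "it was the age of wisdom, it was the age of foolishness,",
)
print(f"verbatim overlap: {score:.0%}")
```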
Meanwhile, Getty Images pursued a parallel strategy, suing Stability AI in both US and English courts in early 2023. [8] The English case advanced more rapidly, reaching trial in June 2025 and judgment on November 4, 2025. Getty abandoned its primary English copyright infringement claims mid‑trial, focusing instead on secondary infringement under English law. The High Court rejected Getty’s argument, ruling that AI model weights in diffusion models such as Stability AI’s Stable Diffusion are not copies of training images. “The model weights are not themselves an infringing copy and they do not store an infringing copy,” Mrs. Justice Smith held, finding that inference “does not require the use of any training data and the model itself does not store training data.” This directly contrasts with the compressed‑copy theory gaining traction in some US pleadings, illustrating how different legal traditions can produce materially different results when confronted with identical technology. [9]
Getty secured limited trademark victories in the UK ruling. The court found that earlier Stable Diffusion versions had infringed Getty’s trademarks by generating outputs with distorted watermarks, though these violations affected only outdated software that had already been superseded. Mrs. Justice Smith added an obiter observation: if she was wrong about the weights not being copies, Stability AI would be liable because staff knew works were scraped without consent and discussed removing watermarks from training data. Both sides claimed victory, with Getty emphasising trademark findings while Stability AI celebrated the copyright rejection. The judge cautioned that her ruling applies specifically to diffusion models and that “if their models actually keep works in their memory,” other AI companies “could be infringing under English copyright law.” Getty’s American case continues in San Francisco, where different fair use doctrine may produce entirely different outcomes.
A Landmark German AI Copyright Ruling: GEMA v. OpenAI

On 11 November 2025, the 42nd Civil Chamber of the Munich I Regional Court (Landgericht München I) largely upheld GEMA’s claims against two OpenAI group companies (Az. 42 O 14139/24), granting injunctive relief and ordering the provision of information and damages in relation to the alleged use and reproduction of protected song lyrics. The ruling concerned lyrics from nine well‑known German songs. [6]
According to the court’s press summary, infringement was found on two levels: (i) reproduction of the relevant lyrics within the language models (described as “memorisation” embodied in model parameters) and (ii) reproduction/making available of the lyrics in ChatGPT outputs generated in response to simple user prompts. The court rejected arguments that liability lay only with end‑users and held that these acts were not excused by copyright limitations, including the EU text‑and‑data‑mining exception. [6]
The court dismissed GEMA’s additional claim based on a violation of general personality rights arising from the incorrect attribution of modified lyrics. Even so, the decision is an important signal that – at least on the court’s current analysis – developers and operators can face direct exposure where models reproduce protected expressive content, and where licensing/opt‑out and output controls are not robust. [6]
The ruling sharpens an emerging divergence across jurisdictions. UK courts (in Getty Images v Stability AI) have rejected the proposition that model weights are themselves infringing copies; Munich, by contrast, treated memorisation embodied in model parameters as capable of satisfying the “embodiment” requirement for reproduction under German law. In the US, fair-use doctrine is evolving along a different path, centred on transformativeness and market substitution. [9][6]
OpenAI has stated that it disagrees with the ruling and is considering next steps; the decision can be appealed. The case is therefore likely to remain in flux, but the direction of travel is clear: systems that can output near‑verbatim protected lyrics will face growing pressure to move to licensed datasets and tighter output controls. [6]
Publishers Fight Back: Traditional Media’s Counter-Offensive
The New York Times elevated the stakes when it sued OpenAI and Microsoft in December 2023 [10], bringing institutional credibility and resources to challenge AI training. The newspaper advanced a sophisticated multi-layered theory spanning training, memorisation, and market substitution. Training involved literal copying – transferring Times articles from Times servers to OpenAI infrastructure. Memorisation occurred when ChatGPT sometimes reproduced substantial article portions verbatim when prompted strategically. Market substitution happened when users accessed Times reporting through ChatGPT instead of subscribing.
The procedural progression revealed judicial seriousness about these claims. Judge Sidney Stein rejected OpenAI’s motion to dismiss on April 4, 2025, allowing primary copyright claims to advance. Then in May 2025, Magistrate Judge Ona T. Wang issued a preservation order requiring OpenAI to retain all ChatGPT conversation logs affecting over 400 million users globally. When OpenAI challenged this burden, Judge Stein upheld the order, demonstrating willingness to impose substantial discovery requirements.
The scale of this discovery highlights unprecedented challenges. Traditional copyright disputes might involve thousands of documents; here, relevant evidence potentially encompasses billions of user interactions, raising complex questions about privacy, privilege, and trade secrets. Yet the courts appear willing to mandate this level of discovery, recognising that without it, AI companies could exploit technical complexity to shield themselves from scrutiny.
A different challenge emerged with Dow Jones & Company and the New York Post’s lawsuit against Perplexity AI, filed on October 21, 2024. Unlike traditional AI training cases, this case focused on “retrieval-augmented generation” (‘RAG’), a technique that combines pre-trained models with real-time database queries. Perplexity allegedly scrapes news content into RAG databases, allowing users to bypass publishers’ websites entirely. The company marketed this as “Skip the links”, which the plaintiffs characterised as a brazen admission of market substitution intent. [11]
Perplexity defended its approach vigorously, arguing that AI-enhanced search represents transformative technology that benefits users by efficiently delivering information. The company claimed publishers wish this technology didn’t exist because they would prefer a world where “publicly reported facts are owned by corporations.” The plaintiffs, however, distinguished Perplexity from traditional search engines, arguing that conventional search enables discovery of their work while Perplexity provides a substitute for it.
RAG technology presents distinct legal questions from training-based models. Traditional AI training involves one-time copying into model parameters; RAG involves ongoing scraping and database maintenance. The copying is more direct, continuous, and obviously serves identical purposes as the original journalism. If the courts find that training qualifies as fair use but RAG does not, AI companies will face complex technical and legal choices about their system architecture.
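The architectural difference can be seen in a few lines of code. The sketch below is a deliberately simplified, self-contained illustration of a RAG pipeline – the class and function names are invented for this example and bear no relation to Perplexity’s actual systems – but it shows why the copying in RAG is ongoing and literal: source text sits verbatim in a retrieval index and is re-served with every query, rather than being absorbed once into model weights.

```python
# Toy RAG pipeline: the retrieval index holds the articles themselves.
from collections import Counter
import math

class SimpleIndex:
    """In-memory index storing full article text (the ongoing, literal copy)."""
    def __init__(self):
        self.docs: list[str] = []

    def add(self, text: str) -> None:
        self.docs.append(text)  # the article is retained verbatim

    def search(self, query: str, k: int = 2) -> list[str]:
        def vec(s: str) -> Counter:
            return Counter(s.lower().split())
        def cosine(a: Counter, b: Counter) -> float:
            dot = sum(a[t] * b[t] for t in a)
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0
        q = vec(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, vec(d)), reverse=True)
        return ranked[:k]

index = SimpleIndex()
index.add("Scraped news article A: full text retained in the database ...")
index.add("Scraped news article B: full text retained in the database ...")

query = "What did article A report?"
context = "\n\n".join(index.search(query))

# The retrieved text is stitched into a prompt for a pre-trained model; the
# model call itself is omitted because the point here is the retrieval step.
prompt = f"Answer using only the sources below.\n\nSOURCES:\n{context}\n\nQUESTION: {query}"
print(prompt)
```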
Regulatory Frameworks: When Legislatures Step In
While courts wrestle with questions of doctrine, regulators worldwide are constructing frameworks that may render some of these legal debates moot. The relevant provisions of the European Union’s AI Act come into force on August 2, 2026, and will require AI developers to disclose detailed training data information, with penalties reaching €15 million or 3% of global revenue. [12] California’s AB 2013 imposes similar disclosure requirements beginning in January 2026. Combined, these create de facto global standards, since major AI companies serve both markets and cannot practically maintain different disclosure practices for different jurisdictions. [13]
The United Kingdom is pursuing a different path. In 2024, the UK explored opt-out mechanisms allowing creators to exclude their work from AI training. This approach acknowledges creators’ concerns while preserving AI companies’ ability to train on non-excluded material without licensing. Whether this balances competing interests or simply shifts burdens to creators remains contested, but it represents an alternative to Europe’s mandatory disclosure model.
Market-based solutions have emerged alongside litigation and regulation. Some AI companies now pursue direct licensing deals with major publishers and content providers. OpenAI signed agreements with the Associated Press and Axel Springer. Perplexity launched a Publisher Program in July 2024 sharing revenue with participating content sources. These voluntary arrangements suggest possible paths forward if litigation creates sufficient pressure.
Technical infrastructure for creator control is developing rapidly. DeviantArt implemented opt-out systems in November 2022, allowing artists to exclude their work from training datasets. Spawning’s ‘Have I Been Trained’ platform has facilitated over 80 million artwork opt-outs. The Coalition for Content Provenance and Authenticity (‘C2PA’), founded in 2021, expects ISO standardisation soon for its technical standards, which allow creators to embed machine-readable licensing preferences directly into digital files.
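In practice, honouring such signals is straightforward for a well-behaved crawler. The sketch below, using only Python’s standard library, checks a site’s robots.txt and looks for page-level ‘noai’-style robots meta directives before a page is added to a training corpus. The crawler name is hypothetical, and the ‘noai’/‘noimageai’ values reflect the DeviantArt-style convention rather than any settled standard.

```python
# Hedged sketch of a crawler honouring machine-readable opt-out signals.
from html.parser import HTMLParser
from urllib import robotparser
from urllib.parse import urljoin, urlparse

class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots" ...> tags."""
    def __init__(self):
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag.lower() != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            for token in (attrs.get("content") or "").lower().split(","):
                self.directives.add(token.strip())

def may_use_for_training(page_url: str, page_html: str,
                         user_agent: str = "example-training-bot") -> bool:
    # 1. Respect robots.txt for this crawler's user agent.
    parts = urlparse(page_url)
    rp = robotparser.RobotFileParser()
    rp.set_url(urljoin(f"{parts.scheme}://{parts.netloc}", "/robots.txt"))
    try:
        rp.read()
        if not rp.can_fetch(user_agent, page_url):
            return False
    except OSError:
        pass  # robots.txt unreachable; a cautious crawler might stop here too

    # 2. Respect page-level opt-out directives such as "noai" / "noimageai".
    parser = RobotsMetaParser()
    parser.feed(page_html)
    return not ({"noai", "noimageai"} & parser.directives)

html = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
print(may_use_for_training("https://example.com/artwork/123", html))  # False: opted out
```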
Whether these voluntary frameworks achieve critical mass adoption depends largely on litigation outcomes. If courts consistently rule AI training constitutes fair use, licensing incentives diminish. If courts find against AI companies, licensing becomes necessary for legal operation. The Anthropic settlement suggests a middle path. Companies may choose to pay rather than litigate even when fair use arguments have merit, valuing certainty over risk.
The Philosophical Question: Who Should Pay for Progress?

Beneath technical legal debate lies a fundamental philosophical question about the proper costs of innovation and who shoulders the risk during legal uncertainty. One perspective characterises AI training as analogous to human learning – reading books to acquire knowledge, studying art to understand techniques, examining code to grasp algorithms. These activities have never required the licensing of every studied work. Demanding that for AI training would impose prohibitive costs that would prevent the emergence of beneficial technology. Creators have no greater claim on AI training data than authors have on readers who learn from their books, this view holds.
The opposing perspective characterises AI training as industrial-scale commercial exploitation, not individual education. Companies aren’t reading to understand human culture – they’re ingesting millions of works to construct profit-generating products. Scale, purpose, and the commercial nature fundamentally distinguish this from the context of traditional fair use. Permitting unconstrained copying creates what critics term “content kleptocracy,” where a creator’s life work becomes the raw material for corporate profit, without consent or compensation.
Both narratives contain elements of validity. AI models genuinely transform training data into novel capabilities, assisting with tasks and generating content that did not exist in the source material. But they also depend entirely on that training data, and commercial success is derived directly from millions of creators’ unpaid labour. The transformation is authentic; so is the appropriation.
The Anthropic settlement reframes this debate by demonstrating that compensation is financially viable. The company paid $1.5 billion – substantial but not existential for a firm that just closed a $13 billion funding round valuing it at $183 billion. “This settlement marks the beginning of a necessary evolution toward a legitimate, market-based licensing scheme for training data,” observed tech industry lawyer, Cecilia Ziniti. “It’s not the end of AI, but the start of a more mature, sustainable ecosystem where creators are compensated.”[14]
Current practice places the costs of uncertainty overwhelmingly on the creators, who watch their work fuel AI systems, without compensation or control, unless and until they win lawsuits or secure regulatory protection. An alternative would place those costs on AI companies: if you want industrial-scale training on copyrighted works, obtain licences or accept liability risk should courts later determine you should have done so. No neutral ground exists here. Any legal framework, or its absence, allocates risks and rewards among parties with divergent interests.
The fundamental tension persists regardless of judicial rulings, settlements, or regulatory actions. Technology enabling machines to learn from human creativity offers genuine benefits alongside genuine harms. Creators’ rights to control and profit from their work rest on centuries of legal tradition and international treaty obligations. We are negotiating boundaries between legitimate but competing claims, and negotiations have barely begun.
The next two to three years should prove critical. Major trials and appellate inflection points loom: the remaining issues in the US book‑training cases; continued motion practice in The New York Times v. OpenAI; the progression of Getty’s re‑filed US claims in Northern California; and, in the visual‑artist litigation, Andersen v. Stability AI, which is currently on a schedule pointing toward 2027. In Europe, the GEMA v. OpenAI decision has reframed the risk analysis for any model capable of reproducing protected text on demand, while the EU AI Act’s transparency and governance obligations begin to bite on a staged timetable. These developments will determine not merely who profits from AI, but what we believe creativity is worth, and what obligations we owe those whose expression trained the machines now augmenting or replacing them.
What emerges will shape not just technology law but our entire creative culture. The Anthropic settlement suggests one possible future where compensation flows alongside innovation. The judicial split suggests another where geography determines legality. The aggressive litigation suggests yet another where uncertainty paralyses development. Which future we inhabit depends on choices being made right now in courtrooms, legislative chambers, and corporate boardrooms – choices that will echo through decades of creative and technological development.
NOTES
[1] Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal.) (order on fair use June 23, 2025); settlement announced Aug. 26, 2025; preliminarily approved Sept. 25, 2025.
[2] Kadrey v. Meta Platforms, Inc., No. 3:23-cv-03417 (N.D. Cal. June 25, 2025).
[3] Thomson Reuters Enter. Ctr. GmbH v. ROSS Intel. Inc., 765 F. Supp. 3d 382 (D. Del. 2025).
[4] UMG Recordings, Inc. v. Uncharted Labs, Inc., No. 1:24-cv-04777 (S.D.N.Y. filed June 24, 2024), settled Oct. 29, 2025; Warner Music Group settled with Udio (Nov. 19, 2025) and Suno (Nov. 25, 2025); Sony Music Entm’t v. Suno, Inc., No. 1:24-cv-11611 (D. Mass. filed June 24, 2024) (pending as of Dec. 2025). See also GEMA, “Suno AI and Open AI: GEMA sues for fair compensation” (stating proceedings against Suno in Munich Regional Court in Jan. 2025).
[5] Concord Music Grp., Inc. v. Anthropic PBC, No. 3:23-cv-01092 (M.D. Tenn. filed Oct. 18, 2023), transferred to No. 5:24-cv-03811 (N.D. Cal.).
[6] GEMA v. OpenAI, Az. 42 O 14139/24 (Munich I Regional Court / Landgericht München I, 11 Nov. 2025) (press summary); see Landgericht München I press release “Urteil GEMA gegen Open AI” (11 Nov. 2025) and the unofficial English translation circulated by IFRRO (press release translation dated 11 Nov. 2025). Practitioner commentary: Bird & Bird, “Landmark ruling of the Munich Regional Court (GEMA v OpenAI) on copyright and AI training” (14 Nov. 2025). Reporting on appeal posture: Reuters, “OpenAI used song lyrics in violation of copyright laws, German court says” (11 Nov. 2025) (OpenAI “considering next steps”; decision appealable). Chronology of proceedings: GEMA, “Suno AI and Open AI: GEMA sues for fair compensation” (stating proceedings against OpenAI filed in Nov. 2024; proceedings against Suno in Munich).
[7] Andersen v. Stability AI Ltd., No. 3:23-cv-00201 (N.D. Cal. filed Jan. 13, 2023); Order Granting in Part Motion to Dismiss, Oct. 30, 2023.
[8] Getty Images (US), Inc. v. Stability AI, Inc., No. 1:23-cv-00135 (D. Del. filed Feb. 3, 2023); voluntarily dismissed and refiled as Getty Images (US), Inc. v. Stability AI, Ltd., No. 3:25-cv-06891 (N.D. Cal. filed Aug. 14, 2025).
[9] Getty Images (US), Inc. v. Stability AI Ltd., [2025] EWHC 2863 (Ch) (Nov. 4, 2025).
[10] The New York Times Co. v. OpenAI, Inc., No. 1:23-cv-11195 (S.D.N.Y. filed Dec. 27, 2023); Order Denying Motion to Dismiss, Apr. 4, 2025.
[11] Dow Jones & Co. v. Perplexity AI Inc., No. 1:24-cv-07984 (S.D.N.Y. filed Oct. 21, 2024).
[12] Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (AI Act), O.J. (L) 2024/1689, entered into force Aug. 1, 2024, full application Aug. 2, 2026.
[13] Cal. A.B. 2013, Artificial Intelligence Training Data Transparency Act (2024), effective Jan. 1, 2026.
[14] Yahoo News report on the Anthropic settlement (quoting Cecilia Ziniti), https://www.yahoo.com/news/articles/anthropic-reaches-1-5-billion-221849839.html.

