Anthropic’s Landmark Settlement: A $1.5 Billion Copyright Precedent in Artificial Intelligence Training Data

2025-09-22

In the summer of 2025, while Silicon Valley’s artificial-intelligence companies were celebrating their latest breakthroughs in machine learning, a different kind of calculation was taking place in a federal courthouse in San Francisco. Anthropic PBC, the creator of the Claude language models, was quietly negotiating what would become the most expensive lesson in copyright law that the tech industry had ever received: a $1.5 billion settlement that dwarfs every previous copyright recovery in American legal history.

The sum is staggering enough to make even venture capitalists pause. But the true significance of Anthropic’s agreement lies not in its size – though at roughly $3,000 per pirated book, it represents a reckoning that authors and publishers could hardly have imagined – but in what it reveals about the shadow economy that has fueled the AI revolution. For years, the brightest minds in technology have been training their algorithms on vast libraries of human knowledge, much of it obtained through what polite company might call “unconventional means”. The Anthropic settlement suggests that this particular party may be coming to an end.

The Genesis of Industry-Altering Litigation

The litigation began in August 2024, when authors Andrea Bartz and Charles Graeber, together with the corporate plaintiff MJ + KJ Inc., sued Anthropic for unauthorized exploitation of their copyrighted works. The gravamen of the complaint went beyond conventional copying allegations: the plaintiffs contended that Anthropic had systematically harvested literary works from two notorious piracy repositories, Library Genesis and Pirate Library Mirror, and then used those materials to train large language models without authorization.

The Shadow Library Ecosystem

Library Genesis (LibGen) and Pirate Library Mirror (PiLiMi) are two of the internet’s largest unauthorized repositories of scholarly publications and literary works, commonly called “shadow libraries”. LibGen, operational since 2008 as a successor to earlier Russian digital library initiatives, aggregates millions of books, academic articles, and scholarly journals and offers unrestricted access to them regardless of copyright protection. The platform describes itself merely as a “link aggregator” for materials “collected from publicly available internet resources”. PiLiMi is a distinct platform with similar functionality, running on separate infrastructure with an independent database. Both services operate in legal gray areas, relying on distributed server architectures and the BitTorrent protocol to distribute content.

Anthropic, according to court documents, did not merely download individual files; it systematically “torrented” entire collections through what the documents characterized as “seeding” and “leeching”, the peer-to-peer processes in which users simultaneously download and distribute files. The platforms’ creators argue that they democratize access to knowledge, particularly in developing nations, but from a legal standpoint they are massive copyright infringement enterprises.

The scale of the alleged infringement was extraordinary. Court documents put the number of downloaded files at approximately seven million, of which roughly 500,000 books were ultimately identified as meeting the class criteria. Each of those works had a registered copyright and had been obtained from a piracy source.

Anatomy of a Record-Breaking Settlement

The settlement, reached after intensive negotiations supervised by retired Judge Layn R. Phillips, establishes an irrevocable fund of no less than $1.5 billion. Payments are spread over twenty-four months, with interest accruing for the benefit of the class.

The payment structure proceeds as follows:

$300 million within five business days of preliminary approval;
$300 million following final approval;
$450 million at twelve months;
$450 million at twenty-four months.

The arithmetic is striking: assuming 500,000 works in the settlement class, each work would receive approximately $3,000. That is four times the $750 statutory minimum per work and fifteen times the $200 minimum that applies in cases of “innocent infringement”.
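The figures above can be checked with a few lines of Python. This is only an illustrative sketch: the tranche amounts, the class size of 500,000 works, and the two statutory minima are taken from the text, while the variable and function names are invented for the example. The result is a gross figure before any deductions.

```python
# Figures as reported in the article; the class size is an approximation.
TRANCHES_USD = {
    "within five business days of preliminary approval": 300_000_000,
    "following final approval": 300_000_000,
    "at twelve months": 450_000_000,
    "at twenty-four months": 450_000_000,
}
CLASS_WORKS = 500_000            # approximate number of works in the class
STATUTORY_MINIMUM_USD = 750      # ordinary statutory minimum per work
INNOCENT_MINIMUM_USD = 200       # reduced minimum for innocent infringement


def fund_total(tranches):
    """Sum all scheduled payments into the settlement fund."""
    return sum(tranches.values())


def gross_per_work(total, works):
    """Gross amount per work, before fees, costs, or interest."""
    return total / works


total = fund_total(TRANCHES_USD)
per_work = gross_per_work(total, CLASS_WORKS)

print(f"Fund total: ${total:,}")
print(f"Gross amount per work: ${per_work:,.0f}")
print(f"Multiple of the $750 minimum: {per_work / STATUTORY_MINIMUM_USD:g}")
print(f"Multiple of the $200 minimum: {per_work / INNOCENT_MINIMUM_USD:g}")
```

The four tranches sum to the full $1.5 billion, and the per-work figure scales inversely with the final size of the class, so a larger or smaller class list would move the $3,000 estimate accordingly.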

Beyond Monetary Compensation

The agreement extends beyond financial restitution. Anthropic has committed to destroying all original files obtained from Library Genesis and Pirate Library Mirror, together with any derivative copies originating from these sources. The company must complete this destruction within thirty days of final judgment and provide written certification of compliance to class counsel.

The limits on the settlement’s scope are just as important. The liability release applies exclusively to past conduct through August 25, 2025 – the date of the preliminary agreement. Future violations remain outside the settlement. Moreover, the agreement entirely excludes claims related to AI-generated content, leaving that question open for future litigation.

Judge Alsup and Precedential Jurisprudence

The proceedings unfolded before Judge William Alsup, known for technologically sophisticated rulings and for programming expertise unusual in a jurist. Alsup previously presided over Oracle America, Inc. v. Google, Inc. during 2012-2016, concerning the Java APIs and the Android system – litigation that defined copyright boundaries in the software domain. During those proceedings, the judge drew attention for directly critiquing Oracle’s counsel, including prominent attorney David Boies, whom he chastised for characterizing the rangeCheck function as innovative when he himself had “written similar code blocks hundreds of times”.

Media coverage widely reported that Alsup had learned Java specifically for the case, though subsequent reporting revealed that he drew primarily on his extensive experience as a hobbyist BASIC programmer. His determination that the Java APIs were not subject to copyright protection was reversed by the Federal Circuit Court of Appeals, but in 2021 the Supreme Court ultimately recognized Google’s use of the APIs as fair use.

Alsup also presided over the first judicial proceedings challenging the American “No Fly List” – a registry of individuals barred from commercial flights. His ruling criticized the government’s redress procedures as inadequate under due process standards. In 2020, he sentenced Anthony Levandowski to eighteen months’ imprisonment for stealing trade secrets from Google’s Waymo division for the benefit of his startup Otto, which was subsequently sold to Uber for $680 million.

In June 2025, Judge Alsup issued a partial summary judgment order that proved pivotal to the case’s trajectory. The court determined that using lawfully purchased and scanned books for AI training constituted fair use, but categorically rejected that defense for materials obtained from piracy sources.

Piracy of copies that were otherwise available for purchase, the court concluded, is “inherently, irredeemably infringing”. This distinction between legitimately acquired and piracy-sourced data may prove fundamental for the entire AI industry.

Risk Calculus Compelling Settlement

What motivated Anthropic to agree to such substantial terms? Court documents reveal a long list of risks confronting the company. Despite its partial success on fair use, Anthropic still faced:

Appeals challenging class certification
Potentially adverse jury verdicts
Multi-year appellate proceedings
Additional lawsuits from other injured parties

For authors and publishers, the settlement likewise eliminated significant uncertainty. Despite favorable preliminary rulings, the litigation could have dragged on for years with no guarantee of the outcome.

Implications for Artificial Intelligence’s Future

This settlement may fundamentally transform AI companies’ approaches to training data acquisition. The distinction between legitimate and piracy sources, established by Judge Alsup, will likely be adopted by other courts adjudicating similar disputes.

Dozens of comparable lawsuits are currently proceeding against major AI companies. The Anthropic settlement may pressure other firms toward similar agreements or, more probably, toward reformed training data acquisition practices.

The settlement, while groundbreaking, presents practical implementation challenges. Identifying the rights holders for 500,000 books and distributing the fund equitably among them is a logistical undertaking of enormous scale. A specialized Author-Publisher Working Group has been established under the leadership of Mary Rasenberger and Maria Pallante to address competing claims to the same works.

Legal fees also merit consideration – potentially reaching 25% of the fund, or $375 million. This represents an astronomical sum even by American class action standards.
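A rough sketch of how a fee award of this size would affect the per-work figure follows. It rests on assumptions that go beyond the text: that the fees are paid out of the fund itself, that the full 25% is awarded, and that nothing else (interest, administration costs) changes the balance. The names used are hypothetical and chosen only for the illustration.

```python
# Figures from the article; the 25% rate is the potential maximum it mentions,
# not an amount actually awarded.
FUND_USD = 1_500_000_000
FEE_RATE_PERCENT = 25
CLASS_WORKS = 500_000  # approximate class size


def fee_amount(fund, rate_percent):
    """Fee in whole dollars at the given percentage of the fund."""
    return fund * rate_percent // 100


def net_per_work(fund, fees, works):
    """Per-work amount after fees.

    Assumption: fees come out of the fund and no other deductions
    or interest are taken into account.
    """
    return (fund - fees) / works


fees = fee_amount(FUND_USD, FEE_RATE_PERCENT)
net = net_per_work(FUND_USD, fees, CLASS_WORKS)

print(f"Fees at {FEE_RATE_PERCENT}%: ${fees:,}")
print(f"Net per work under these assumptions: ${net:,.0f}")
```

Under these simplified assumptions the fee comes to $375 million, matching the figure in the text, and the per-work amount falls from about $3,000 to about $2,250. The actual distribution depends on what the court approves.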

Conclusion: A New Era of Accountability

The Anthropic settlement closes one chapter in AI’s legal evolution and opens another. Technology companies can no longer disregard copyright protections under the pretense of innovation. The era of “ask for forgiveness, not permission” has ended, at least where piracy-sourced content is concerned.

It remains an open question whether this settlement will genuinely deter other companies from similar practices or will merely be treated as a cost of doing business. Given AI’s expanding revenues, even $1.5 billion may be a price the industry is willing to pay for access to vast repositories of pirated content.

One conclusion appears certain: a benchmark has been set, and subsequent litigation will proceed in its shadow. For creators and copyright holders, this may herald a new era of fair compensation for the use of their works in artificial intelligence development.

Case Citations:

Bartz et al. v. Anthropic PBC, No. 3:24-cv-05417-WHA (N.D. Cal.)

Oracle America, Inc. v. Google, Inc., No. 3:10-cv-03561-WHA (N.D. Cal.)

Ibrahim v. Department of Homeland Security, No. 3:06-cv-00545-WHA (N.D. Cal.) (No Fly List litigation)

United States v. Levandowski, No. 3:19-cr-00377-WHA (N.D. Cal.)

Robert Nogacki

Robert Nogacki – licensed legal counsel (radca prawny, WA-9026), Founder of Kancelaria Prawna Skarbiec.

