Publishers and bestselling author sue Meta for Llama AI training on copyrighted works

Five major publishers and author Scott Turow filed a class-action lawsuit against Meta and CEO Mark Zuckerberg, alleging the company pirated millions of copyrighted books and journals to train Llama.

Objective Facts

On May 5, 2026, publishing houses Hachette, Macmillan, McGraw Hill, Elsevier and Cengage, along with bestselling author Scott Turow, filed a class-action lawsuit against Meta and CEO Mark Zuckerberg. The plaintiffs argue Meta knowingly copied copyrighted materials from notorious pirate websites such as LibGen and Anna's Archive to train successive iterations of its Llama language model — with Zuckerberg's personal authorization to do so. The complaint alleges the company pirated more than 267 terabytes of copyrighted books and journal articles to train its Llama AI models. From January to April 2023, Meta discussed increasing the company's "dataset licensing" budget to as much as $200 million, but in early April 2023 "Meta abruptly stopped its licensing strategy," and its business development team received verbal instructions to halt licensing efforts. Nkechi Nneji, a public affairs director for Meta, responded: "AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use." The lawsuit follows a June 2025 ruling in which Judge Vince Chhabria granted summary judgment for Meta on fair-use grounds in an earlier author lawsuit; the publishers argue their case presents stronger evidence of willful infringement through documented decision-making.

Left-Leaning Perspective

Publishers and author advocacy groups framed this lawsuit as a response to what they characterize as blatant corporate theft. Authors Guild CEO Mary Rasenberger called it "the most flagrant copyright breach in history" and stated "these voracious tech companies need to be held accountable." The Association of American Publishers argued that Meta made "calculated decisions to enrich itself with literary properties that it did not create and does not own," and stated that "Meta's mass-scale infringement isn't public progress, and AI will never be properly realized if tech companies prioritize pirate sites over scholarship and imagination." The publishers emphasized that "strong intellectual property protections are fundamental to the innovation that makes this progress possible. As AI evolves, it must be developed and deployed in ways that respect and uphold these protections." The left-leaning analysis emphasized Meta's deliberate choice to abandon licensing negotiations in favor of piracy. Commentary noted that "the fair use rulings Meta cites were issued before any court confronted documented evidence of a deliberate choice to pirate over license," and that "Anthropic found out what that distinction costs: after a judge ruled its piracy was not fair use, it agreed to a $1.5 billion settlement with authors." Turow argued: "All Americans should understand that the bold future promised by A.I., has been, to paraphrase the investigative writer Alex Reisner, created with stolen words. It is all the more shameful that these violations of the law were undertaken by one of the richest corporations in the world." The left's coverage emphasizes corporate accountability for creators and intellectual property protection as foundational to innovation, with particular focus on Meta's internal documents showing a deliberate strategic choice to pirate rather than license.

Right-Leaning Perspective

Meta and tech-industry supporters defended the company's use of copyrighted material as transformative fair use, pointing to judicial precedent. Meta's Nkechi Nneji stated: "AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use." This defense reflects a broader industry stance that "innovation should not be stifled by overly restrictive interpretations of copyright law." The right emphasized that courts have already addressed this question favorably for tech companies. Judge Vince Chhabria ruled in June 2025 that Meta had engaged in "fair use" when it used a data set of nearly 200,000 books to train its Llama language model for generative AI. Industry commentary noted that "the tech industry has operated under the assumption that training AI on publicly available or widely distributed content may qualify as fair use, especially if the resulting output is transformative." Some analysis framed the core dispute as a doctrinal question: "The outcome of these lawsuits will likely hinge on how judges interpret the concept of fair use in the context of machine learning. Traditionally, fair use allows limited use of copyrighted material without permission for purposes such as criticism, education, or parody. But AI does not fit neatly into these categories. It does not quote or critique in the traditional sense; it learns, adapts, and generates new content based on patterns extracted from existing works." Right-leaning defenders of Meta emphasize that the technology creates genuine value and transformation, and that courts have already vindicated this approach.

Deep Dive

This lawsuit is the latest in a long line of AI-training copyright cases, but it is meaningfully different from most of those that have come before. Where the earlier Kadrey case involved roughly 666 specific books from a small group of individual authors, the new complaint covers the entire publishing operations of five companies that together account for a substantial share of the world's academic, educational, and trade publishing output. Titles include not only literary works such as N.K. Jemisin's "The Fifth Season" and Peter Brown's "The Wild Robot" but also textbooks, scientific journal articles, and reference works. The market for those works, particularly the academic and educational categories, is structurally different from the trade-fiction market that dominated the Kadrey plaintiff set. Judge Vince Chhabria's earlier ruling on fair use "was unusually narrow, and unusually candid about its limits. He said publicly that Meta's win 'may be in significant tension with reality' and that the ruling applied only to the specific authors who had brought the case."

The new lawsuit addresses what courts said was missing from earlier cases: documented evidence of willful intent to circumvent licensing. The fair use rulings Meta cites were issued before any court confronted documented evidence of a deliberate choice to pirate over license. The complaint documents that Meta considered increasing its dataset licensing budget from $17 million to $200 million before Zuckerberg authorized a shift to piracy — giving plaintiffs the evidence of willful infringement that earlier lawsuits lacked. This paper trail is significant because it distinguishes this case from the Kadrey ruling, which Meta won on fair-use grounds despite the judge acknowledging concerns about market harm.

Anthropic, meanwhile, found out what willful piracy costs: after a judge ruled its piracy was not fair use, it agreed to a $1.5 billion settlement with authors in a case known as Bartz v. Anthropic. That settlement awaits final court approval on May 14, 2026. This creates a critical question for the Meta case: whether the presence of internal documents showing deliberate avoidance of licensing will push a court to rule differently on fair use than judges did in earlier AI copyright cases. The procedural calendar will move slowly. Class certification, motions to dismiss, summary-judgment briefing, and trial scheduling will, in the ordinary course, take 18 to 24 months. The outcome will have implications not only for Meta but for how copyright law adapts to AI development across the industry.


May 5, 2026 · Updated May 6, 2026

Left says: Authors Guild CEO Mary Rasenberger called it "the most flagrant copyright breach in history," arguing that major tech companies deliberately circumvent licensing markets to exploit creative workers without compensation.
Right says: Meta argues that courts have already found AI training on copyrighted material can constitute fair use, and that innovation should not be stifled by restrictive copyright interpretations.
✓ Common Ground
Multiple voices across the spectrum acknowledge that courts have issued divergent rulings on this question: all pending cases are likely to revolve around whether AI systems make fair use of copyrighted material, and "the first two judges to consider the matter" have "issued diverging rulings last year."
Both sides recognize that "the new lawsuit adds pressure at a time when courts are still sorting out how copyright law applies to AI," and that "creators across publishing, news, and the arts have taken aim at companies such as Meta, OpenAI, and Anthropic."
Some observers on both sides acknowledge that Anthropic's $1.5 billion settlement with authors represents a precedent and financial marker for how courts may value copyright claims in AI training cases.

◈ Tone Comparison

Publishers used stark moral language, with Mary Rasenberger calling the conduct "the most flagrant copyright breach in history," and Turow invoking "stolen words" to frame the issue as one of corporate theft. Meta's tone, by contrast, emphasizes positive societal benefits, with the company highlighting how "AI is powering transformative innovations, productivity and creativity." Publishers frame IP protection as foundational to creativity and innovation; Meta frames strict copyright interpretations as a potential brake on beneficial innovation.