In what may prove a pivotal legal moment for the artificial intelligence industry, the New York Times has filed suit against OpenAI over the unauthorized use of years of journalistic content in the training of its generative AI models.
As authors and artists have done over the past year, the New York Times is taking legal action against OpenAI and its major investor Microsoft over their use of the Times archive in training ChatGPT and subsequent AI models.
The lawsuit was filed Wednesday after months in which the Times had reportedly been trying to negotiate a deal with OpenAI for the use of its content.
"Defendants seek to free-ride on The Times’s massive investment in its journalism," the complaint says, per the Times, and it accuses the companies of "using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it."
Without naming a specific figure, the suit seeks "billions of dollars in statutory and actual damages" relating to the "unlawful copying and use of The Times's uniquely valuable works."
As the Associated Press reports, the Times' complaint cites examples in which GPT-4, the successor to the model behind ChatGPT, and Microsoft's Bing chatbot not only summarized or paraphrased articles from the Times, but reproduced them nearly verbatim. One example given was "a Pulitzer-Prize winning investigation into New York City's taxi industry that was published in 2019 and took 18 months to complete."
The lawsuit reportedly refers to talks that the Times has had with OpenAI going back to April, saying that the news organization wanted to "ensure it received fair value" for the AI's use of its archive, but that those talks went nowhere. The Times also says it seeks to "facilitate the continuation of a healthy news ecosystem, and help develop GenAI technology in a responsible way that benefits society and supports a well-informed public."
In addition to raising issues of copyright infringement, the suit also argues that real damage may be done to the Times brand by these chatbots — in particular when they spit out false information and attribute it to the Times, something the suit cites examples of as well.
Relatedly, the Associated Press reached a licensing agreement with OpenAI earlier this year for the use of its content. And German publisher Axel Springer, which owns Politico and Business Insider, struck a similar licensing deal this month.
Some of the novel questions raised by this new technology could end up at the Supreme Court in the coming terms, but first lower courts will have to rule on their merits. Multiple lawsuits have already been filed by authors over the unauthorized "scraping" of their books, including by comedian Sarah Silverman, who joined a class-action lawsuit by writers in July. That suit and others contend that the authors' work was fed into OpenAI's models, without their consent, so that the chatbots could mimic or co-opt their voices and styles.
SF-based Anthropic, a competing AI company founded by some former OpenAI employees, is similarly being sued by Universal Music and other music publishers over the unauthorized use of their material.
The Times suit raises the possibility that chatbots could ultimately pose a threat to news organizations. As the Times explains, "When chatbots are asked about current events or other newsworthy topics, they can generate answers that rely on past journalism by The Times." The newspaper expresses concern that readers will be satisfied with a response from a chatbot and decline to visit the Times' website, thus reducing web traffic that can be translated into advertising and subscription revenue.
Interestingly, the Times has retained the firm of Susman Godfrey to represent it in the suit — the same firm that successfully represented Dominion Voting Systems in its defamation suit against Fox News.
OpenAI has yet to respond to the suit, which was filed Wednesday in Federal District Court in New York.
Related: SF-Based AI Company Anthropic Sued by Music Publishers for Using Pop Stars’ Copyrighted Work
Photo: Stephan Valentin