Copyright AI growth has put Internet content behind paywalls - but OpenAI now on defense

Michael Wechsler

Administrator
Staff member
Jurisdiction: US Federal Law
The trillion-dollar AI question that has been putting increasing numbers of free Internet content sites behind paywalls is "Can AI developers train their models on copyrighted works without permission under fair use?" For decades our terms of use have forbidden machines to make use of our content in the way that Artificial Intelligence LLMs (Large Language Models) are exploiting it: consuming our content and then providing answers directly to their visitors, based upon the knowledge shared here, without ever attributing or sending remuneration to the source of that knowledge. It's Grand Theft Autobot.

In re: OpenAI, Inc. Copyright Infringement Litigation (MDL, SDNY) is a consolidated legal action in which over a dozen copyright plaintiffs (such as the New York Times newspaper) are suing OpenAI for copyright infringement and seeking a ruling that what OpenAI and other LLM developers do does not fall under the Fair Use Exception to copyright. A recent ruling regarding electronic discovery has OpenAI in a tight spot, required to disclose to the copyright plaintiffs millions of ChatGPT logs that relate to training the AI model without the permission of the copyright owners. As Jones Walker summarizes:

"If plaintiffs demonstrate that ChatGPT routinely generates outputs that compete with or substitute for copyrighted content — even when users aren't specifically requesting plaintiffs' works — OpenAI's fair use defense becomes considerably harder to sustain."

Source: Jones Walker, "OpenAI Loses Privacy Gambit: 20 Million ChatGPT Logs Likely Headed to Copyright Plaintiffs"

Earlier in the case, the U.S. District Court for the Southern District of New York ruled that the consolidated class action of copyright holders adequately stated a prima facie case for copyright infringement, shutting down OpenAI's attempt to dismiss the action.

The policy and legal argument against finding OpenAI liable is that if OpenAI doesn't do what it is doing, other actors outside the jurisdiction of the court (such as those in China) will. On that view, all the court and legislature would be doing is picking winners against the interests of U.S.-based companies, those winners being Chinese AI companies such as DeepSeek, Moonshot AI, Zhipu AI, and other tech giants with deep pockets.
 
So OpenAI's main argument is that it ought to be able to violate U.S. copyright law because the Chinese violate it? That's not a great argument. If the court does what it is supposed to do, which is make rulings based on the law, what the Chinese are doing shouldn't be a factor in the Court's decision. Courts decide cases and controversies involving the parties before them. Chinese AI firms are not parties to the case and are outside the scope of the court's power. Policy is made by Congress. If Congress thinks some action needs to be taken to level the AI playing field, it is in a better position to fashion the appropriate remedy. The issue for the Court is just whether OpenAI is violating copyright law. That's all the judge should be focused upon. If OpenAI can't put forward a convincing argument that what it's doing is legal, it should lose, regardless of what foreign competitors are going to do.
 
Good catch. I didn't include the reference to this article, which explains the policy argument.


Providing "freedom-focused" recommendations on Trump's plan during a public comment period ending Saturday, OpenAI suggested Thursday that the US should end these court fights by shifting its copyright strategy to promote the AI industry's "freedom to learn." Otherwise, the People's Republic of China (PRC) will likely continue accessing copyrighted data that US companies cannot access, supposedly giving China a leg up "while gaining little in the way of protections for the original IP creators," OpenAI argued.

Whenever your back is against the wall, especially with this administration, use words and phrases that include "freedom" to argue that your liberating someone else's property is for the greater good. It's not infringement. It's freedom to learn that is being restricted.

In fairness, the argument isn't completely without clever logic. If the property is inevitably going to be stolen anyway, at least the good guys will be able to police it and protect it better by having stolen it first. Furthermore, when it comes to national security, we need our machine to be at least as good as our enemies'.

Blockchain and crypto have created automated systems that cross borders and create incredibly difficult, complex problems for regulating and policing the flow of value and currency. Right at the same time you've got AI crossing borders and creating another extremely difficult enforcement challenge. We live in an interesting and challenging time.
 