OpenAI accidentally erased ChatGPT training findings as lawyers seek evidence of copyright violations


The New York Times and Daily News have sued OpenAI and its investor Microsoft over suspicions that ChatGPT was trained on their copyrighted works. Now, it turns out, the lawyers’ research into the training data was erased last week by OpenAI engineers, presumably by accident.

NY Times lawyers had their potential evidence against OpenAI deleted

Kyle Wiggers writes for TechCrunch:

Earlier this fall, OpenAI agreed to provide two virtual machines so that counsel for The Times and Daily News could perform searches for their copyrighted content in its AI training sets…In a letter, attorneys for the publishers say that they and experts they hired have spent over 150 hours since November 1 searching OpenAI’s training data.

But on November 14, OpenAI engineers erased all the publishers’ search data stored on one of the virtual machines, according to the aforementioned letter, which was filed in the U.S. District Court for the Southern District of New York late Wednesday.

The letter has been published online for all to read.

It seems that after NY Times lawyers spent significant time compiling data from ChatGPT’s training set, their research was erased by OpenAI.

The letter states that OpenAI was later able to recover much of the data, but only in a form that makes it unusable in legal proceedings. Thus, it can’t be deployed against OpenAI in the case, and the expensive and time-consuming work begins anew.

9to5Mac’s Take

The training data used by various AI companies unfortunately remains opaque. Not every publisher has the resources to pursue legal action against tech giants, but to mount that effort and then have your work accidentally deleted by OpenAI engineers? It’s a bad look, to say the least.

What do you make of this story? Let us know in the comments.
