OpenAI Faces Legal Scrutiny Over Deletion of Allegedly Pirated Book Datasets

This article was generated by AI and cites original sources.

OpenAI is facing legal pressure over its deletion of two book datasets, known as ‘Books 1’ and ‘Books 2,’ which it removed before the release of ChatGPT in 2022. The datasets were allegedly sourced from Library Genesis (LibGen), and they are now at the center of a class-action lawsuit brought by authors who claim their works were used without permission.

OpenAI initially cited ‘non-use’ as its reason for deleting the datasets, but subsequent legal developments have raised questions about the motives behind the deletion. The authors pressed for transparency, and a court has ordered OpenAI to disclose internal communications about the deletion, including discussions with in-house lawyers and references to LibGen that were previously withheld under attorney-client privilege.

The case underscores the complexity of data ethics and intellectual property rights in artificial intelligence. As AI models become more sophisticated and data-intensive, sourcing and using datasets ethically becomes essential, both to avoid legal disputes and to protect intellectual property.

Source: Ars Technica
