Reddit sues Anthropic for allegedly using user data without consent to train AI models

Reddit has filed a lawsuit against Anthropic in a Northern California court, alleging the AI company used the personal data of Reddit users, including their deleted posts, to train its artificial intelligence models without user consent. In the suit, Reddit claims Anthropic’s actions violated the platform’s user agreement and ignored its robots.txt file, which tells automated crawlers which content they may not scrape.

Reddit also asserts that Anthropic bots continued to access the platform more than 100,000 times, even after Anthropic said in 2024 it had blocked such activity. The company further argues this scraping compromised user privacy, as it involved content users intended to be deleted and no longer accessible.

While Reddit has signed licensing deals with OpenAI that govern data use under explicit privacy terms, it says Anthropic declined to negotiate a similar agreement. Notably, Sam Altman, CEO of OpenAI, owns 8.7% of Reddit and was previously on its board. Anthropic has publicly stated it disagrees with Reddit’s claims and intends to contest the case vigorously. Reddit is seeking compensatory damages, recovery of profits from the scraped data, and a court injunction to prevent further use of Reddit content.

by Mauricio B. Holguin

Reddit is a social news platform where user-shared content is organized into topic-specific communities called subreddits. Content is voted on by users, influencing its visibility and potential appearance on the front page. Key features include a community-based structure, a built-in commenting system, and a voting mechanism. Reddit is rated 2.9 and has several alternatives for users seeking different experiences.

Comments

Shaz Shah
0

While Reddit has earned a reputation as a breeding ground for trolls, I did find some value in reading and learning from others' opinions for my research. But since it went public, it has gone further downhill. As it's mostly for fun or sharing opinions, I don't see why it would be a good dataset for training AI, when clear and concise facts are needed. And didn't Reddit piss off its users by messing around with APIs and screwing around with third-party clients?

xSalty1
0

jfc Just steal our data great.

RDF0909
4

Reddit is the last place I'd scrape to train an AI - it'd come out retarded.

Gaian Terra Canis Rufus
0

That’s why you go to Lemm.ee

Link is here: https://lemm.ee/

2 replies
Azazel

And what would that solve exactly? If AI bots aren't already scraping reddit alternatives, then it's just a matter of time

carotte

Lemmy is cool, but i wouldn't recommend the Lemmy instance that's shutting down at the end of the month as a good place to start lol

Gu