Reddit Sues Perplexity: The AI Data War In 2025

reddit sues perplexity

In a dramatic turn that exposes the growing friction between online platforms and artificial intelligence companies, Reddit sues Perplexity, accusing the AI-powered search startup of illegally scraping user posts to train its language models. Filed in a New York federal court, the lawsuit names Perplexity, along with three alleged partners—Lithuanian data firm Oxylabs, Texas-based SerpApi, and former Russian botnet AWMProxy—for “bypassing technical barriers” and disguising their data collection practices.

According to Reddit, these entities effectively built a shadow system that siphoned millions of user posts, turning community-driven discussions into unlicensed AI training data. Ben Lee, Reddit’s Chief Legal Officer, went as far as to call it an “industrial-scale data laundering operation,” accusing Perplexity and its partners of stealing user-generated content to fuel AI development without permission or compensation.


What Sparked Reddit’s Lawsuit Against Perplexity

Reddit’s stance isn’t against AI innovation—it’s against AI companies exploiting its ecosystem without consent. The platform has officially licensed its data to companies like OpenAI and Google, both of which signed structured agreements for ethical data usage. These deals not only bring revenue to Reddit but also establish a transparent framework where content creators indirectly benefit when their posts help train AI systems.

However, in the Reddit sues Perplexity case, things took a darker turn. The lawsuit alleges that Perplexity, with the help of its data vendors, used automated scrapers and proxy services to extract Reddit data directly from Google search results, bypassing rate limits and user protections.

To prove it, Reddit engineers reportedly created a “test post” designed only for Google indexing—never meant for real users. Within hours, Perplexity allegedly surfaced that exact post in its AI responses, confirming unauthorized scraping. Even after Reddit sent a cease-and-desist notice, mentions of Reddit content on Perplexity’s platform allegedly increased fortyfold, further strengthening the case.


How the Defendants Are Responding

Perplexity has firmly denied all allegations, labeling Reddit’s lawsuit as a form of “corporate extortion”. The company argues that it merely summarizes publicly available data, which it believes doesn’t fall under copyright restriction. Similarly, SerpApi and Oxylabs have pushed back, claiming their scraping methods operate within legal bounds, as the information is already accessible on the open web.

However, Reddit’s counterargument is straightforward: public visibility doesn’t mean public ownership. The platform insists that large-scale scraping without permission undermines user trust, violates API terms, and directly affects its growing AI licensing business, which reportedly contributes nearly 10% of Reddit’s total revenue.

As for AWMProxy, the company has not released any official statement, leaving its alleged involvement unclear.


Why the Reddit Sues Perplexity Case Matters

This lawsuit isn’t just a corporate dispute—it’s a defining moment for AI ethics and data ownership. As AI models require massive amounts of human-generated content to function, companies like Reddit are beginning to realize the real monetary and moral value of their platforms’ communities.

By taking a public stand, Reddit joins a growing list of content publishers challenging the AI industry’s “free data” culture. Similar lawsuits have already emerged from The New York Times and Simon & Schuster, all arguing that AI companies can’t build billion-dollar technologies using unpaid, user-created content.

The Reddit sues Perplexity case also raises difficult questions:

  • Where does fair use end and data theft begin?

  • Should AI companies pay platforms and creators for the content that powers their intelligence?

  • And how do we protect the open web without stifling AI innovation?


Looking Ahead: Setting the Tone for AI Accountability

As the lawsuit unfolds in U.S. federal court, its outcome could set a major precedent for AI governance worldwide. If Reddit succeeds, AI developers might be forced to adopt stricter data sourcing standards, sign formal licensing deals, or even compensate platforms retroactively for past scraping.

For Perplexity, this case could define its credibility as an AI-driven search engine in a market increasingly focused on transparency and ethical data use. For Reddit, it’s an opportunity to reinforce that its users’ voices—and their content—hold real value that deserves protection.

At its core, Reddit sues Perplexity isn’t just about one AI model or one platform. It’s about who owns the internet’s collective intelligence. In a world where every post, comment, and meme could train an AI, this battle signals that the era of unregulated data scraping might finally be coming to an end.

Leave a Comment