Reddit sues Perplexity and three data‑scraping firms over unlicensed use of Reddit content
Reddit has filed suit against Perplexity, SerpApi, Oxylabs and AWMProxy, alleging that the companies scraped Reddit content from Google search results without a license and then used or sold that data. The complaint seeks financial damages and a permanent injunction to stop the sale or reuse of previously scraped Reddit material.
Why Reddit is suing
- Paid access model: Since 2023, Reddit has charged for API/data access and has inked licensing deals with Google and OpenAI.
- End‑run via search: Reddit alleges the defendants avoided payments by scraping Google’s indexed results for Reddit posts, then feeding that data to products and customers.
- Evidence cited: Reddit says a hidden “test post,” crawlable only by Google, surfaced in Perplexity’s answer engine within hours, which Reddit argues shows the content was scraped from search results.
Perplexity’s response
Perplexity said it hadn’t yet received the lawsuit but will “fight vigorously for users’ rights to freely and fairly access public knowledge,” adding that its approach is “principled and responsible.”
Context: The data‑licensing land grab
- Reddit’s defenses: In 2024, Reddit rate‑limited unknown bots and crawlers; in August 2025, it curtailed the Wayback Machine's access; it also adopted Really Simple Licensing (RSL) to attach licensing terms to robots.txt.
- Licensing trend: Publishers and platforms increasingly seek compensation from AI firms training on their content; Reddit positions itself as a paid data provider and is building its own AI answer features.
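For readers unfamiliar with how robots.txt signals crawl permissions, the following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot names (`LicensedBot`, `UnknownBot`) and the `example.com` URL are all hypothetical and are not taken from Reddit's actual file or from the complaint. It only illustrates the general idea that a site can allow named crawlers while disallowing everyone else, and that compliance depends on the crawler choosing to check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed, all others are not.
# This is an illustrative assumption, not Reddit's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: LicensedBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/python/"
    print("LicensedBot:", is_allowed(SAMPLE_ROBOTS_TXT, "LicensedBot", url))
    print("UnknownBot:", is_allowed(SAMPLE_ROBOTS_TXT, "UnknownBot", url))
```

Note that robots.txt is advisory: nothing in the file technically prevents a crawler from ignoring it, and it says nothing about copies of a page that appear in a third party's search results.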
What’s at stake
- For AI firms: A ruling could clarify whether scraping search results for user‑generated content (UGC) without a direct license creates legal liability.
- For platforms: A ruling could validate (or limit) efforts to monetize data and control crawling beyond robots.txt.
- For users: Potential shifts in how community content is accessed, indexed and monetized.
What happens next
- Defendants will respond; the court may consider Reddit’s request for an injunction.
- Watch for discovery around scraping methods, robots.txt compliance and any use in model training.
Discussion: Is scraping search results for user‑generated content fair use, or should platforms be paid whenever AI companies ingest that data?