
Cloudflare blocks AI crawlers by default and launches Pay Per Crawl for publishers
Cloudflare has rolled out a default policy to block known artificial intelligence web crawlers, aiming to prevent the unapproved collection and use of website content by AI companies. Under the new approach, domain owners setting up a site on Cloudflare are prompted to specify whether or not to permit AI crawler access, giving users immediate control over data scraping activities.
Cloudflare is also launching a private beta of its Pay Per Crawl program. This system allows selected publishers and content creators to charge AI scrapers a fee for accessing their data. Initially available only to a group of leading industry partners, participation is planned to expand through a dedicated signup portal or via Cloudflare representatives. AI companies are provided with transparent pricing and can opt in or decline access.
To support payment enforcement, Cloudflare is reviving HTTP status code 402 (Payment Required) and integrating it into its infrastructure. Website owners can set domain-wide rates or create specific rules to exempt certain crawlers. These settings are available in the Cloudflare dashboard, and technical users can implement policies and headers for programmatic control. Hybrid models are supported, allowing free access for partners and paid access for others. Future updates may add more detailed pricing controls and agent-driven paywalls.
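The article doesn't specify Cloudflare's actual protocol, but the gist of the enforcement logic can be sketched as a small decision function: known AI crawlers get HTTP 402 unless they present payment, while exempted partners and ordinary visitors get HTTP 200. The agent list, the `X-Crawl-Payment` header name, and the price string below are all hypothetical placeholders, not Cloudflare's real API.

```python
# Minimal sketch of edge logic for Pay Per Crawl-style enforcement.
# All names (agent strings, header, price) are illustrative assumptions.
KNOWN_AI_AGENTS = ("GPTBot", "CCBot", "ClaudeBot")  # hypothetical crawler list
FREE_PARTNERS = ("CCBot",)  # hybrid model: partners exempted from payment
PRICE = "0.01 USD per request"  # hypothetical domain-wide rate

def crawl_decision(user_agent: str, headers: dict) -> int:
    """Return the HTTP status to serve: 200 (allow) or 402 (payment required)."""
    is_ai = any(bot in user_agent for bot in KNOWN_AI_AGENTS)
    is_partner = any(bot in user_agent for bot in FREE_PARTNERS)
    if is_ai and not is_partner and "X-Crawl-Payment" not in headers:
        # Revived HTTP 402: signal that access is paid for this crawler.
        # A real response would also advertise the rate, e.g. via a header
        # carrying PRICE, so the crawler can decide whether to opt in.
        return 402
    return 200
```

A regular browser (`crawl_decision("Mozilla/5.0", {})`) is served normally, an unpaying AI crawler gets 402, and the same crawler presenting a payment header gets through; the partner exemption mirrors the hybrid free/paid model the article describes.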
Well, if too many "good content" websites decide to block AI scrapers (and Cloudflare still doesn't explain how it distinguishes AI companies' crawlers from other, more legitimate crawlers), AI training and outputs will get even worse, especially for small companies. It will be the same problem we had with the Google search engine last decade, when most websites would only allow Google to crawl them, handing a monopoly to a single company (and forcing other search engines to sometimes crawl Google itself); then those same websites complained that Google was too powerful. Sure, runaway crawlers that can DDoS websites are bad and should be limited or even banned, but given how aggressively AI companies crawl the web nowadays, thousands of requests from a single company produce much the same load as dozens of requests from hundreds of smaller ones, except the latter is nearly impossible to block (for example, when crawling happens from user devices). As for the promised "fair and transparent" AI training, we still don't have a single clue what that is supposed to mean. AI models are still black boxes, by design.