Cloudflare announced a new deadline for the AI industry to separate web crawlers used for traditional search purposes, such as Google Search, from web crawlers used for AI agents and training. Starting September 15, 2026, Cloudflare’s default settings will block “mixed-use” crawlers from all pages that host ads, the company announced Wednesday.
This means that crawlers that combine search, agent usage, and training are blocked by default from crawling these sites unless the site owner adjusts the settings. These changes to defaults will apply to new Cloudflare customers, new sites set up by existing customers, and all existing free customers, the company said.
This move could affect how AI model providers access web content to train models and power agent services.
Cloudflare points out that most website owners want their content to be discoverable via search and often through AI services, but they also want protection against their intellectual property being given away for free.
Cloudflare specifically refers to the “world’s largest search engine” (obviously Google!) as having access to “nearly twice as much information” as other AI companies, making it difficult for site owners to remain discoverable to customers without being exploited by AI.
Google has pushed back against this generalization in the past, pointing out that it offers a control called Google-Extended that allows site owners to opt out of having their content used for training and for AI products and services like Gemini Apps and Vertex APIs. Using it does not affect whether a site is included in Google Search. However, Googlebot, the tech giant’s flagship crawler, crawls for Search, which includes AI features such as AI Overviews and AI Mode.
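As a rough illustration of that opt-out, Google-Extended is a user-agent token that site owners reference in robots.txt. The sketch below uses Python's standard-library robots.txt parser as a stand-in to show how such a rule separates the two tokens. The rules and the example.com URL are assumptions for illustration, and Python's parser is not how Google itself evaluates robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder URL, not a real publisher page.
url = "https://example.com/article"
googlebot_allowed = parser.can_fetch("Googlebot", url)
extended_allowed = parser.can_fetch("Google-Extended", url)

print("Googlebot allowed:", googlebot_allowed)
print("Google-Extended allowed:", extended_allowed)
```

Under these rules the search crawler keeps its access while the Google-Extended token is disallowed. As the article notes, this does not separate AI features that are served through Googlebot itself.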
In announcing the news, Matthew Prince, Cloudflare’s co-founder and CEO, cited the recent milestone of bot traffic surpassing human traffic online for the first time, saying, “Now that the majority of traffic on the internet is non-human, we must move further and move faster to ensure a sustainable ecosystem can emerge.” That crossover had not been expected to occur until next year.
“Cloudflare’s new tools and partnerships will bring visibility and commercial opportunities to website owners and benefit AI companies deploying bots with clear and transparent intent. We hope that the default changes we propose will encourage mixed-use crawlers to separate search from agent usage and training,” Prince said.
Cloudflare offers a number of products to help users launch their own AI systems, but the company has also released a variety of tools to give publishers more control over their content in the AI era. In recent years, Cloudflare has launched tools to combat AI bots, including a marketplace called Pay Per Crawl that allows websites to charge AI bots for scraping.
Pay Per Crawl is now evolving into “Pay Per Use,” the company says, allowing publishers to charge AI companies not just when the content is crawled, but when the content creates value.
This change could also help save bandwidth for publishers and computing resources for AI model providers, as Cloudflare data suggests that more than 50% of crawl traffic from AI crawlers is spent refetching pages that have not changed.
To launch Pay Per Use, Cloudflare is initially working with two partners: Ceramic.ai and You.com. Once publishers opt in, they will be paid when their content appears in Ceramic’s AI search results or when You.com accesses a piece of their premium content.
Cloudflare says other AI companies can also customize the Pay Per Use model to fit their own systems.
