Cloudflare has rolled out a fresh policy aimed squarely at how artificial intelligence companies collect content from websites.
The new Content Signals Policy, as Cloudflare calls it, allows web publishers and creators to set more detailed boundaries around what AI bots can and cannot use. Until now, website administrators mostly relied on an old protocol known as robots.txt, a plain text file that asks bots to stay away from certain parts of a site. Compliance is entirely voluntary, and the explosion of AI-powered search tools has revealed the limits of this approach, as many AI crawlers simply ignore the file or find ways around it.
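To see why robots.txt is only a polite request, it helps to look at how a well-behaved crawler reads it. Python's standard library includes a parser for the format; the sketch below uses real, documented crawler names (OpenAI's GPTBot) against a hypothetical site, and shows that the file merely answers the question "may I fetch this?", with nothing enforcing the answer:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical site's robots.txt: block OpenAI's training
# crawler, allow everyone else.
rules = [
    "User-agent: GPTBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# A compliant crawler checks before fetching; a non-compliant
# one simply never calls this.
print(rp.can_fetch("GPTBot", "https://example.com/article"))      # False
print(rp.can_fetch("FriendlyBot", "https://example.com/article"))  # True
```

The whole mechanism rests on the crawler choosing to ask, which is exactly the gap Cloudflare's policy tries to close.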
More than 3.8 million domains already use Cloudflare’s robot protection tools. This latest policy further sharpens how content can be licensed and what data these AI bots are allowed to collect, especially when it comes to Google’s ambitious AI-powered search projects.
Cloudflare’s CEO Matthew Prince pointed the finger at Google, arguing that the tech giant’s approach to crawling the web for both standard search and newer generative AI tools gives it an unfair leg up over competitors. “Every AI answer engine should have to play by the same rules,” Prince said.
Legal Tension and New Choices for Google
By applying this new license across its massive network — nearly 20 percent of the internet by some estimates — Cloudflare is creating a fork in the road for Google. Moving forward, Google faces a choice: either untangle its crawlers, splitting them so AI engines do not feast on sites that do not want them, or risk missing out on a huge chunk of the web for both search results and AI processing.
Prince claimed the updated license is not just a suggestion, but a digital contract that could carry legal heft. “Google’s legal team will see this for what it is — a contract with legal ramifications if they ignore it,” he warned. In practical terms, this means millions of sites under Cloudflare’s umbrella are about to get more say over whether they feed the AI boom.
The policy also highlights the sharp contrast in how tech leaders approach scraping the web. While Prince called out Google for blending its bots, he pointed to OpenAI as an example of more responsible behavior: OpenAI maintains separate crawlers for its AI and search-related tasks.
The Content Signals Policy is not just about blocking bots. It gives website owners the power to draw fine lines: they can specify whether their pages may be included in traditional search, in AI model training, or in the growing world of AI-generated answers and summaries.
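In practice, those fine lines are expressed directly in robots.txt. The fragment below is a sketch based on Cloudflare's announced approach, which adds a Content-Signal line alongside the usual directives; the exact field names shown here (search, ai-input, ai-train) follow Cloudflare's published examples but should be treated as illustrative rather than a definitive spec:

```txt
# Allow traditional search indexing, allow AI answer engines to
# cite pages, but opt out of AI model training.
User-Agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /
```

Crawlers that predate the convention simply ignore the extra line, which is why Cloudflare pairs the signal with the legal framing Prince describes above.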
With this move, Cloudflare asserts that the old gentleman’s agreement of robots.txt is no longer enough. The company wants to turn technical preferences into signals with real legal stakes. As Prince put it, “The internet cannot wait for a solution while in the meantime, creators’ original content is used for profit by other companies.”
Google, for its part, maintains that its AI search still sends meaningful traffic to sites, and the company contends it cares about the web’s health. Yet the landscape is shifting as the tools and rules that govern artificial intelligence get rewritten in real time.