AI Agents Behind Cloudflare: How to Stop reCAPTCHA Loops on Your Site

AI agents behind Cloudflare quietly trigger bot challenges every time they touch the site. One morning in May 2026, that quiet problem became a loud one for us — sengo.com’s homepage was greeting visitors with a stuck reCAPTCHA overlay. Here’s exactly what happened, and the WAF rule pattern that fixed it permanently.

 

Why AI Agents Behind Cloudflare Trigger reCAPTCHA Loops

We’ve been running a fleet of AI agents against our own site — content audits, SEO scans, page generators, draft pushes. They all hit sengo.com over HTTPS, exactly like a regular visitor would. That worked fine for months. Then one morning, mobile visitors started reporting a Cloudflare-style overlay on the homepage with the message “This site is exceeding reCAPTCHA Enterprise free quota.”

Login still worked, and wp-admin was fine. But the public marketing pages were broken for a slice of real visitors. After ruling out our security plugin and our contact form’s CAPTCHA, we found the real source: our agents themselves. Every script we ran was, from Cloudflare’s perspective, a robot, and one that came back hundreds of times a day from a single residential IP.

That’s the first lesson about AI agents behind Cloudflare: your own automation is hostile traffic by default. Cloudflare doesn’t know your laptop’s curl call is friendly. It sees the same fingerprint patterns it sees from credential stuffing tools and scrapers — and it treats them the same way.

 

How Cloudflare’s Bot Protection Sees Agent Traffic

Cloudflare’s bot management runs on layered signals — User-Agent strings, JA3/JA4 TLS fingerprints, header order, behavioral patterns, and IP reputation. The naive defenses you might reach for fail predictably.

  • Spoofing User-Agent isn’t enough. Cloudflare correlates UA with TLS fingerprint. A request that claims to be Chrome but uses curl’s TLS profile is flagged instantly.
  • Rotating IPs doesn’t help small teams. Residential proxies are easy to detect, and a single developer hammering the site from one IP gets that IP’s reputation flagged within hours.
  • Even legitimate libraries are obvious. Python requests, Node fetch, Go’s http.Client: each ships a recognizable TLS handshake.

On the Pro plan, this enforcement is called Super Bot Fight Mode. It classifies traffic into Definitely automated, Likely automated, and Verified bots, and lets you choose an action for each. By default, “Definitely automated” gets a Managed Challenge, which means your agents are blocked behind a CAPTCHA they can’t solve.

 

The Cached-Challenge Cascade: Why This Affected Real Visitors

Here’s the part most teams miss. When Cloudflare issues a Managed Challenge, the challenge page uses Google reCAPTCHA Enterprise behind the scenes — but it doesn’t use your Google Cloud project. It uses Cloudflare’s own shared reCAPTCHA key, which is pooled across all their customers.

That pooled key has a free-tier quota. When agent traffic across many sites adds up, the quota gets exhausted, and Google starts refusing to issue tokens. The challenge page then renders with the message “This site is exceeding reCAPTCHA Enterprise free quota.”

So far, just a degraded challenge experience for bots. The real damage comes next: on the Pro plan, Cloudflare’s Automatic Platform Optimization (APO) for WordPress caches HTML responses at the edge — including, in some edge cases, the broken challenge page itself. A real human visitor, hitting the same Cloudflare PoP, gets served that cached broken page. Suddenly your homepage is unusable, and the symptom looks unrelated to anything you’ve done.
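
One way to catch this cascade early is a small check that flags a response that is both served from the edge cache and looks like a challenge page. The sketch below is a heuristic, not a Cloudflare-provided check: the function name and the marker list are hypothetical, the markers are simply the strings quoted in this article, and it assumes a cached page carries the response header cf-cache-status: HIT.

```python
from typing import Mapping

# Strings quoted in this article; treat them as assumptions and adjust
# them to whatever your own broken page actually contained.
CHALLENGE_MARKERS = (
    "<title>Just a moment...</title>",
    "exceeding reCAPTCHA Enterprise free quota",
)


def looks_like_cached_challenge(headers: Mapping[str, str], body: str) -> bool:
    """Return True when a response is a cache HIT and contains a challenge marker."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    served_from_cache = normalized.get("cf-cache-status", "").upper() == "HIT"
    has_marker = any(marker in body for marker in CHALLENGE_MARKERS)
    return served_from_cache and has_marker
```

Run against the homepage on a schedule, a check like this would surface a cached challenge page before visitors report it.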

This is why agent hygiene matters even if you don’t care about scraper protection. As a result, the actions you take to keep AI agents behind Cloudflare working cleanly have a direct effect on the experience of every real visitor to your marketing site.

 

The Fix: A WAF Skip Rule and a Secret Header

The clean solution is a private credential that agents present on every request, and a Cloudflare WAF rule that recognizes it and bypasses the bot management layer. It’s the same pattern Cloudflare itself recommends for trusted-source automation.

The mechanism is simple:

  1. Generate a long, random secret token. Treat it like an API key: store it in a Git-ignored .env file and never commit it.
  2. Make every agent script send that token in a custom header like X-Agent-Token.
  3. Configure a Cloudflare WAF custom rule: if the header matches the secret value, Skip all security features (WAF managed rules, Super Bot Fight Mode, rate limiting).

Because the rule depends on a private secret rather than an IP or a User-Agent, it survives laptop changes, VPN rotation, and remote work. And because it skips the security checks only for requests that carry the token, rather than disabling them, real visitors still get the full protection.
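
The client side of this mechanism can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than our production code: the helper names generate_agent_token and build_agent_request are hypothetical, and the header name X-Agent-Token follows the example above.

```python
import secrets
import urllib.request


def generate_agent_token(num_bytes: int = 32) -> str:
    """Return a random hex token (32 bytes gives 64 hex characters)."""
    return secrets.token_hex(num_bytes)


def build_agent_request(url: str, token: str) -> urllib.request.Request:
    """Build a request that carries the private agent token.

    Refusing an empty token avoids silently sending a request that
    Cloudflare would challenge anyway.
    """
    if not token:
        raise ValueError("agent token is empty; set AGENT_TOKEN first")
    return urllib.request.Request(url, headers={"X-Agent-Token": token})
```

In a real script the token would come from the environment, for example os.environ["AGENT_TOKEN"], and the request would be sent with urllib.request.urlopen. Failing loudly on a missing token is a design choice: a challenged agent is harder to debug than a script that stops immediately.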

For a fuller treatment of where this sits in the broader Cloudflare expression language, the Cloudflare WAF custom rules documentation is the canonical reference.

 

Implementing the Cloudflare AI Agent Allowlist

Here’s the concrete five-step recipe, drawn from how we ship this on sengo.com.

  1. Generate the secret. 32 random bytes is plenty. In PowerShell: [System.BitConverter]::ToString([System.Security.Cryptography.RandomNumberGenerator]::GetBytes(32)).Replace('-','').ToLower(). Save the output as AGENT_TOKEN in your .env.local (gitignored).
  2. Add the Cloudflare WAF rule. Dashboard → your site → Security → WAF → Custom Rules → Create. Expression: (http.request.headers["x-agent-token"][0] eq "<your-secret-token>"). Action: Skip. Check every box: WAF managed rules, rate limiting rules, Super Bot Fight Mode, all remaining custom rules. Deploy.
  3. Wire the header into your agent scripts. Any script that hits the site needs to send the header. For shell: curl -H "X-Agent-Token: $AGENT_TOKEN" …. For Python: add {"X-Agent-Token": os.environ.get("AGENT_TOKEN", "")} to the request headers dict. For a Node script using fetch: include it in the headers option.
  4. Document the convention in your repository. Add a short note in CLAUDE.md, README, or your agent’s system prompt instructing future scripts to include the header. Without it, new automation will silently get challenged again.
  5. Optionally, add a verified-bot allowlist for legitimate AI crawlers. If you want ClaudeBot, GPTBot, PerplexityBot, and similar to reach your content for citation purposes, add a second rule that Skips when cf.client.bot is true and the User-Agent matches one of those names. The cf.client.bot field is Cloudflare’s verified-bot signal: Cloudflare validates the bot’s identity itself instead of trusting the User-Agent string, so changing a User-Agent is not enough to set it.
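
The two rule expressions from steps 2 and 5 can be generated rather than typed by hand, which keeps the header name and bot list consistent across environments. The sketch below only builds the expression strings to paste into the Custom Rules editor; the function names are hypothetical, and the use of http.user_agent contains for the name match is our assumption about how to express "the User-Agent matches one of those names".

```python
from typing import Iterable


def token_skip_expression(token: str, header_name: str = "x-agent-token") -> str:
    """Expression for the Skip rule that matches the private agent token."""
    if '"' in token or "\\" in token:
        raise ValueError("token must not contain quotes or backslashes")
    # The header name is lower-cased, matching the expression in step 2.
    return f'(http.request.headers["{header_name.lower()}"][0] eq "{token}")'


def verified_bot_expression(bot_names: Iterable[str]) -> str:
    """Expression for the optional rule that skips for verified AI crawlers."""
    names = list(bot_names)
    if not names:
        raise ValueError("at least one bot name is required")
    clauses = " or ".join(f'http.user_agent contains "{name}"' for name in names)
    # cf.client.bot must be true AND the User-Agent must name a listed bot.
    return f"(cf.client.bot and ({clauses}))"
```

For example, verified_bot_expression(["ClaudeBot", "GPTBot", "PerplexityBot"]) yields a single expression that requires both the verified-bot signal and a matching name, so a spoofed User-Agent alone does not qualify.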

 

Verifying That the Cloudflare AI Agent Rule Works

We’ve found a simple two-call test catches every misconfiguration. From any machine, with a bot-like User-Agent:

  • Request 1 — without the token: curl -A "curl-test/1.0" https://<your-site>/. Expected: HTTP 403, response title <title>Just a moment...</title>, and a response header cf-mitigated: challenge.
  • Request 2 — with the token: curl -A "curl-test/1.0" -H "X-Agent-Token: $AGENT_TOKEN" https://<your-site>/. Expected: HTTP 200, full page body, no cf-mitigated header.

If both return 200, your bot mode might be set to Allow somewhere; check Super Bot Fight Mode in the Security → Bots dashboard. Conversely, if both return 403, your rule isn’t deployed yet or the header name is mismatched.
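
The two-call test and its interpretation can also be scripted. The following is a rough sketch using only the Python standard library: probe, is_challenged, and diagnose are hypothetical helper names, and the status codes and the cf-mitigated header are the ones described in this section.

```python
import urllib.error
import urllib.request
from typing import Dict, Mapping, Optional, Tuple


def probe(url: str, token: Optional[str] = None) -> Tuple[int, Dict[str, str]]:
    """Fetch url with a bot-like User-Agent; return (status, lower-cased headers)."""
    headers = {"User-Agent": "curl-test/1.0"}
    if token:
        headers["X-Agent-Token"] = token
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=15) as response:
            return response.status, {k.lower(): v for k, v in response.headers.items()}
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, so a 403 challenge response lands here.
        return err.code, {k.lower(): v for k, v in err.headers.items()}


def is_challenged(status: int, headers: Mapping[str, str]) -> bool:
    """True when the response looks like a Cloudflare challenge."""
    normalized = {k.lower(): v for k, v in headers.items()}
    return status == 403 and normalized.get("cf-mitigated", "").lower() == "challenge"


def diagnose(status_without_token: int, status_with_token: int) -> str:
    """Map the two status codes to the outcomes described above."""
    if status_without_token == 403 and status_with_token == 200:
        return "ok: rule is deployed and the token is recognized"
    if status_without_token == 200 and status_with_token == 200:
        return "check Super Bot Fight Mode: bot mode may be set to Allow"
    if status_without_token == 403 and status_with_token == 403:
        return "rule not deployed yet or header name mismatched"
    return "unexpected combination: inspect the Security Events log"
```

A typical run calls probe twice, once without and once with the token read from AGENT_TOKEN, then passes the two status codes to diagnose.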

After 24 to 48 hours of normal traffic, the Security → Events log will show the rule firing with each agent request. Filter by your rule name and confirm that real automation is matching it. In short, if you see your script’s calls in the Skip log, you’re done.

 

Lessons for Running AI Agents in Production

This incident taught us a few things we’d apply to any team running AI agents behind Cloudflare or any comparable edge layer.

  • Don’t evade detection; work with the protection layer. User-Agent spoofing and IP rotation are short-lived workarounds. An allowlist based on a private credential is durable and auditable.
  • Plan for cache cascade effects. Edge caching (APO, LiteSpeed, CDN) can turn a one-off bot incident into an outage for real visitors. Always account for the path from the upstream challenge to the cached response delivered downstream.
  • Use Cloudflare’s verified-bot signal for real crawlers. Spoofable UA matching gives anyone a free pass. The verified-bot field cf.client.bot is what you want when allowlisting ClaudeBot, GPTBot, Bingbot, and the rest.
  • Rotate the agent token like any other secret. Quarterly rotation is reasonable. Update the WAF rule and your .env file in sync.
  • Tell your agents about the convention. Whether it’s a CLAUDE.md note or a documented coding standard, the rule has to outlive the developer who set it up.

The broader pattern here is that AI agents behind Cloudflare, or behind any production edge layer, deserve first-class operational treatment, not after-the-fact workarounds. Treat your automation as part of your infrastructure, give it credentials, and audit its access as you would for a human member of your team.

 

How Sengo Helps Teams Run AI Agents Safely

We’ve been on both sides of this problem — building AI agents that work against client sites, and protecting client sites against bad automation. The pattern in this article comes straight from a real incident on our own site, and it’s the same blueprint we deploy for clients integrating Claude Code, custom AI agents, or third-party AI search crawlers into their Cloudflare-protected infrastructure.

If your team is starting to run AI agents against your own production site, or you’re worried that AI search crawlers are getting challenged and your content isn’t being cited, that’s a problem we can size up quickly. Our “Enable Teams with AI” solution covers exactly this kind of operational hardening for sites already in production. We’ve helped clients across financial services, higher education, and retail get their agent and crawler traffic on the right footing without weakening their security posture.

 

Book a free 30-minute consultation call

Sources and references

  1. Cloudflare WAF Custom Rules. developers.cloudflare.com
  2. Cloudflare Super Bot Fight Mode (Pro Plan). developers.cloudflare.com
  3. Cloudflare Verified Bots. developers.cloudflare.com
  4. Cloudflare Bot Management Overview. developers.cloudflare.com
  5. reCAPTCHA Enterprise Quotas and Limits. cloud.google.com