AI agents behind Cloudflare quietly trigger bot challenges every time they touch the site. One morning in May 2026, that quiet problem became a loud one for us — sengo.com’s homepage was greeting visitors with a stuck reCAPTCHA overlay. Here’s exactly what happened, and the WAF rule pattern that fixed it permanently.
Jean-Nicolas Gauthier
We’ve been running a fleet of AI agents against our own site — content audits, SEO scans, page generators, draft pushes. They all hit sengo.com over HTTPS, exactly like a regular visitor would. That worked fine for months. Then one morning, mobile visitors started reporting a Cloudflare-style overlay on the homepage with the message “This site is exceeding reCAPTCHA Enterprise free quota.”
Login still worked. wp-admin was fine. But the public marketing pages were broken for a slice of real visitors. After ruling out our security plugin and our contact form’s CAPTCHA, we found the real source: our agents themselves. Every script we ran was, from Cloudflare’s perspective, a robot — and a robot that came back hundreds of times a day from a single residential IP.
That’s the first lesson about AI agents behind Cloudflare: your own automation is hostile traffic by default. Cloudflare doesn’t know your laptop’s curl call is friendly. It sees the same fingerprint patterns it sees from credential stuffing tools and scrapers — and it treats them the same way.
Cloudflare’s bot management runs on layered signals — User-Agent strings, JA3/JA4 TLS fingerprints, header order, behavioral patterns, and IP reputation. The naive defenses you might reach for fail predictably.
Spoofing the User-Agent, for example, gets you nowhere: requests, Node fetch, Go’s http.Client — each ships a recognizable TLS handshake.

On the Pro plan, this enforcement is called Super Bot Fight Mode. It classifies traffic into Definitely automated, Likely automated, and Verified bots, and lets you choose an action for each. By default, “Definitely automated” gets a Managed Challenge — which means your agents are blocked behind a CAPTCHA they can’t solve.
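You can see this classification from the agent’s side by checking a response for Cloudflare’s challenge markers. A minimal sketch: the cf-mitigated response header and the “Just a moment...” interstitial title are real Cloudflare signals, while the helper name and its heuristic shape are ours:

```python
def looks_challenged(status: int, headers: dict, body: str = "") -> bool:
    """Heuristic: did Cloudflare intercept this request with a
    Managed Challenge instead of serving the real page?"""
    # Cloudflare sets cf-mitigated: challenge on challenge responses.
    if headers.get("cf-mitigated", "").lower() == "challenge":
        return True
    # The interstitial page carries a distinctive title.
    return status == 403 and "Just a moment" in body

# A challenged response vs. a normal one.
print(looks_challenged(403, {"cf-mitigated": "challenge"}))  # True
print(looks_challenged(200, {"server": "cloudflare"}))       # False
```

Logging this flag inside your agents turns a silent failure (scraping a CAPTCHA page as if it were content) into a visible one.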
Here’s the part most teams miss. When Cloudflare issues a Managed Challenge, the challenge page uses Google reCAPTCHA Enterprise behind the scenes — but it doesn’t use your Google Cloud project. It uses Cloudflare’s own shared reCAPTCHA key, which is pooled across all their customers.
That pooled key has a free-tier quota. When agent traffic across many sites adds up, the quota gets exhausted, and Google starts refusing to issue tokens. The challenge page then renders with the message “This site is exceeding reCAPTCHA Enterprise free quota.”
So far, just a degraded challenge experience for bots. The real damage comes next: on the Pro plan, Cloudflare’s Automatic Platform Optimization (APO) for WordPress caches HTML responses at the edge — including, in some edge cases, the broken challenge page itself. A real human visitor, hitting the same Cloudflare PoP, gets served that cached broken page. Suddenly your homepage is unusable, and the symptom looks unrelated to anything you’ve done.
This is why agent hygiene matters even if you don’t care about scraper protection. The steps you take to keep AI agents behind Cloudflare working cleanly have a direct effect on the experience of every real visitor to your marketing site.
The clean solution is a private credential that agents present on every request, and a Cloudflare WAF rule that recognizes it and bypasses the bot management layer. It’s the same pattern Cloudflare itself recommends for trusted-source automation.
The mechanism is simple:
Agents attach a private secret to every request, sent as a custom X-Agent-Token header, and a WAF custom rule matches it and skips the bot-management layer. Because the rule depends on a private secret rather than an IP or a User-Agent, it survives laptop changes, VPN rotation, and remote work. And because it skips the security layer rather than disabling it, real visitors still get the full protection envelope.
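On the agent side, the mechanism is one header. Here is a minimal standard-library sketch; the AGENT_TOKEN environment variable name and the helper are our conventions, not anything Cloudflare requires:

```python
import os
import urllib.request

def agent_request(url: str) -> urllib.request.Request:
    """Build a request carrying the private agent credential.
    The token comes from the environment, never hard-coded."""
    token = os.environ.get("AGENT_TOKEN", "")
    return urllib.request.Request(url, headers={"X-Agent-Token": token})

req = agent_request("https://example.com/")
# urllib normalizes header names to Capitalized-with-dashes form.
print(req.get_header("X-agent-token") is not None)  # True
```

The same one-liner works in any HTTP client: as long as every request your automation makes carries the header, the WAF rule sees it before any bot scoring runs.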
For a fuller treatment of where this sits in the broader Cloudflare expression language, the Cloudflare WAF custom rules documentation is the canonical reference.
Here’s the concrete five-step recipe, drawn from how we ship this on sengo.com.
1. Generate a strong secret. On PowerShell: [System.BitConverter]::ToString([System.Security.Cryptography.RandomNumberGenerator]::GetBytes(32)).Replace('-','').ToLower(). Save the output as AGENT_TOKEN in your .env.local (gitignored).
2. Create a WAF custom rule with the expression (http.request.headers["x-agent-token"][0] eq "<your-secret-token>"). Action: Skip. Check every box: WAF managed rules, rate limiting rules, Super Bot Fight Mode, all remaining custom rules. Deploy.
3. Add the header to every agent call: curl -H "X-Agent-Token: $AGENT_TOKEN" …. For Python: add {"X-Agent-Token": os.environ.get("AGENT_TOKEN", "")} to the request headers dict. For a Node script using fetch: include it in the headers option.
4. For verified third-party crawlers you can’t hand the token to, add a separate rule that fires when cf.client.bot is true and the User-Agent matches one of those names. The cf.client.bot field is Cloudflare’s cryptographically verified bot signal — it can’t be spoofed.
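If you manage Cloudflare configuration from scripts, the rule expression in the recipe can be generated rather than hand-typed. A sketch in which only the syntax inside the string is Cloudflare’s expression language; the function itself is ours:

```python
def skip_rule_expression(token: str, header: str = "x-agent-token") -> str:
    """Build the WAF custom-rule expression matching the agent header.
    Cloudflare exposes each request header as a list of values,
    hence the [0] index on the first value."""
    return f'(http.request.headers["{header}"][0] eq "{token}")'

print(skip_rule_expression("<your-secret-token>"))
# (http.request.headers["x-agent-token"][0] eq "<your-secret-token>")
```

Generating the expression keeps the header name in one place, so the agents and the WAF rule can never drift apart.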
We’ve found a simple two-call test catches every misconfiguration. From any machine, with a bot-like User-Agent:
1. curl -A "curl-test/1.0" https://<your-site>/. Expected: HTTP 403, response title <title>Just a moment...</title>, and a response header cf-mitigated: challenge.
2. curl -A "curl-test/1.0" -H "X-Agent-Token: $AGENT_TOKEN" https://<your-site>/. Expected: HTTP 200, full page body, no cf-mitigated header.

If both return 200, your bot mode might be set to Allow somewhere — check Super Bot Fight Mode in the Security → Bots dashboard. If both return 403, your rule isn’t deployed yet or the header name is mismatched.
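The two-call test is easy to automate in CI. This sketch encodes the decision table above as a function; the verdict wording is ours:

```python
def diagnose(status_plain: int, status_with_token: int) -> str:
    """Interpret the two-call smoke test: one bot-like request
    without the token, one with it."""
    if status_plain == 403 and status_with_token == 200:
        return "ok: challenge fires without token, skip rule works with it"
    if status_plain == 200 and status_with_token == 200:
        return "check bot mode: nothing is challenged (Allow set somewhere?)"
    if status_plain == 403 and status_with_token == 403:
        return "check rule: not deployed yet, or header name mismatched"
    return "unexpected combination: inspect Security -> Events"

print(diagnose(403, 200))
```

Run it after every Cloudflare configuration change; the failure modes map one-to-one onto the checks in the paragraph above.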
After 24 to 48 hours of normal traffic, the Security → Events log will show the rule firing with each agent request. Filter by your rule name and confirm that real automation is matching it. If you see your script’s calls in the Skip log, you’re done.
This incident taught us a few things we’d apply to any team running AI agents behind Cloudflare or any comparable edge layer.
The most important one: cf.client.bot is what you want when allowlisting ClaudeBot, GPTBot, Bingbot, and the rest, never a bare User-Agent match that anyone can fake. The broader pattern is that AI agents behind Cloudflare, or behind any production edge layer, deserve first-class operational treatment, not after-the-fact patches. Treat your automation as part of your infrastructure: give it credentials and audit its access the way you would for a human team member.
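The verified-crawler allowlist can be generated the same way as the skip rule. A sketch, assuming you combine cf.client.bot with substring matches on the User-Agent; the helper function is ours:

```python
def verified_bot_expression(names: list[str]) -> str:
    """WAF expression: Cloudflare-verified bot AND one of the named
    crawlers. cf.client.bot only matches bots Cloudflare has verified,
    so a spoofed User-Agent alone never satisfies this rule."""
    ua_checks = " or ".join(f'http.user_agent contains "{n}"' for n in names)
    return f"(cf.client.bot and ({ua_checks}))"

print(verified_bot_expression(["ClaudeBot", "GPTBot"]))
```

Pair this rule with an Allow (or a Skip scoped to Super Bot Fight Mode only) so verified AI crawlers can index and cite your content without ever seeing a challenge.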
We’ve been on both sides of this problem — building AI agents that work against client sites, and protecting client sites against bad automation. The pattern in this article comes straight from a real incident on our own site, and it’s the same blueprint we deploy for clients integrating Claude Code, custom AI agents, or third-party AI search crawlers into their Cloudflare-protected infrastructure.
If your team is starting to run AI agents against your own production site — or you’re worried that AI search crawlers are getting challenged and your content isn’t being cited — that’s a problem we can size up quickly. Our AI team enablement solution covers exactly this kind of operational hardening for sites already in production. We’ve helped clients across financial services, higher education, and retail get their agent and crawler traffic on the right footing without weakening their security posture.