The new bot wave doesn't run scripts. It reasons. ChatGPT Atlas, Claude Computer Use, OpenAI's Operator, Browser Use, MultiOn — every one of them sends an LLM into a real browser session with a goal in natural language ("book me the cheapest flight Tuesday", "fill this insurance form using my profile", "find a 2-bed in Berlin under €1500"). They click, they type, they navigate, they retry. Legacy bot defenses miss them because they were built on the assumption that bots are scripts and scripts are stupid. These aren't. Here's how to detect them, and the policy decision your team needs to make in the next quarter.

If you're a SaaS, an e-commerce site, an airline, a job board, or a public-data API, you're already getting agentic AI traffic in volume. Most teams haven't measured it because their detection stack reports it as "human" — which is exactly the bucket the agent operators want it in.

The agentic browser stack, today

The major commercial agents as of mid-2026:

  • ChatGPT Atlas — OpenAI's consumer-facing agentic browser. Runs Chromium; controlled by GPT-5 / GPT-5-mini depending on plan tier. Identifiable User-Agent ("ChatGPT-User" plus a Chromium UA string in current builds).
  • OpenAI Operator — separate product, runs in OpenAI-hosted virtual environments; egress IPs are OpenAI-owned ASN ranges.
  • Claude Computer Use — Anthropic's API-driven computer-use surface. Runs in customer-managed environments (developer responsibility for browser config). User-agent varies depending on host setup.
  • Browser Use (open-source library) — Python framework that wraps Playwright with LLM action selection. Egress is whatever the developer configures; UA is patched to look like real Chrome unless explicitly overridden.
  • MultiOn, AutoGPT-Browser, AgentGPT, Skyvern, LaVague — adjacent open-source / startup tooling, varying maturity. Lower volume but visible in instrumented data.

The traffic split we observe (May 2026): ChatGPT Atlas dominates by volume (roughly 60% of identified agentic traffic), Operator second (~15%), Claude-driven sessions third (~10%), open-source frameworks distributed across the remainder.

Why legacy bot detection fails

Three structural reasons:

  1. The behavior is plausible. The agent reads the page, decides on an action, executes it, observes the result. The pacing is human-like. The action sequence is novel per-session because the LLM generates it. Pattern-matching against "scripts that follow fixed sequences" doesn't fire.
  2. The browser is real. ChatGPT Atlas is Chromium with a controller wrapped around it. The GPU pipeline, the rendering, the JavaScript engine — all genuine Chromium. Headless-detection signatures don't apply because it's not running headless.
  3. The input events are physically plausible. Mouse movement, scroll, click, keystroke — all dispatched through the real browser input stack. Input-event entropy sits in the human range, even if the underlying distributions differ.

The signals that worked against Selenium and Puppeteer don't carry over. New signals are needed.

The signals that catch agentic AI

1. Origin / identity (cheap, exact)

Commercial agents run on identifiable infrastructure with identifiable user-agents:

  • OpenAI's ChatGPT-User UA is documented and consistent. Match on UA + ASN intersection.
  • OpenAI Operator egresses from OpenAI ASN ranges (largely AS56947 and adjacent). The IP block is finite and updateable.
  • Claude Computer Use egresses from wherever the developer hosts; the UA pattern when running headlessly via Anthropic's reference implementation is recognizable.

This is the cheapest detection layer. Run it first. It catches the well-behaved commercial agents in milliseconds with zero false positives.
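A minimal sketch of that first layer, assuming the client IP has already been resolved to an ASN upstream. The ChatGPT-User UA token is documented by OpenAI; the ASN set and the product mapping below are illustrative placeholders you'd maintain from the vendors' published ranges:

```typescript
// Origin-layer check: UA token + egress ASN intersection.
// UA tokens and ASN list are illustrative; keep the real lists in config
// and refresh them from the vendors' published ranges.

interface OriginSignal {
  agent: "chatgpt-atlas" | "openai-operator" | null;
  matched: "ua+asn" | "ua-only" | "asn-only" | null;
}

const AGENT_UA_TOKENS = ["ChatGPT-User"]; // documented OpenAI agent UA token
const OPENAI_ASNS = new Set([56947]);     // example entry; maintain from published ranges

export function originCheck(userAgent: string, asn: number): OriginSignal {
  const uaHit = AGENT_UA_TOKENS.some((t) => userAgent.includes(t));
  const asnHit = OPENAI_ASNS.has(asn);

  if (uaHit && asnHit) return { agent: "chatgpt-atlas", matched: "ua+asn" };
  if (asnHit)          return { agent: "openai-operator", matched: "asn-only" };
  if (uaHit)           return { agent: "chatgpt-atlas", matched: "ua-only" };
  return { agent: null, matched: null };
}
```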

2. LLM-decision-pattern signature (medium, reliable)

Agents driven by LLMs leave decision-flow signatures:

  • Action-then-pause-then-revise. An LLM agent commits to an action, observes the result, sometimes recognizes it was wrong, and retries or backs out. Real users do this too, but with a different temporal signature — the agent's "thinking pause" between observation and corrective action is consistent (LLM inference latency, ~1.5–6 seconds depending on model and prompt size). Real users vary much more (a sketch of this timing check follows this list).
  • Novel-but-not-creative paths. An agent will navigate via the most-direct visible link to a goal. Real users wander, get distracted, hit the back button, open a second tab. An agent's path through a multi-step funnel is unusually direct.
  • Form-fill over-completeness. LLMs tend to fill optional fields when they have data, whereas humans skip optionals more often. The pattern of every optional field being filled is a soft agent signal.
  • Textarea over-explanation. Free-text fields filled by an agent often contain LLM-characteristic phrasing (well-formed paragraphs, no typos, no run-ons, complete sentences in contexts where humans use fragments).
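To make the timing signal from the first bullet concrete, here's a minimal sketch that scores how tightly a session's observation-to-action gaps cluster inside a typical inference-latency band. The band and thresholds are illustrative starting points, not tuned values:

```typescript
// Scores the "consistent thinking pause" signal from per-session gaps (in ms)
// between a page-state change and the next input event.
export function thinkingPauseSignal(gapsMs: number[]): number {
  if (gapsMs.length < 5) return 0; // too few actions to judge

  const inBand = gapsMs.filter((g) => g >= 1500 && g <= 6000);
  const bandShare = inBand.length / gapsMs.length;

  const mean = gapsMs.reduce((a, b) => a + b, 0) / gapsMs.length;
  const variance = gapsMs.reduce((a, b) => a + (b - mean) ** 2, 0) / gapsMs.length;
  const cv = Math.sqrt(variance) / mean; // coefficient of variation; human pacing is noisier

  const consistency = Math.max(0, 1 - cv);     // 1 means perfectly regular pacing
  return Math.min(1, bandShare * consistency); // 0..1 soft score; combine with other signals
}
```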

3. Tool-use traces and prompt-injection probes

Some agents leave traces of their toolchain in headers, request fingerprints, or specific behaviors. ChatGPT Atlas honors specific accessibility-API hooks. Operator-driven sessions show characteristic viewport and DPI combinations. Browser-Use sets cookies during initialization that don't match Playwright defaults. These are micro-signals that combine well with the macro signals.

You can also actively probe: prompt-injection-style strings hidden in page metadata that, if echoed back in subsequent agent traffic, identify an agent that consumed and re-emitted the injection. Use sparingly and ethically — this is a research tool more than a production tool.
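As a research-only illustration of that canary idea, a sketch with hypothetical helper names; the metadata placement and tag format are assumptions, not a production recipe:

```typescript
// Per-session canary placed in page metadata. If the tag later shows up in text
// the visitor submits, the page content was consumed and re-emitted verbatim;
// human visitors never see it. Research tool, use sparingly.
import { randomUUID } from "crypto";

export function makeCanary(): { tag: string; metaHtml: string } {
  const tag = `ref-${randomUUID().slice(0, 8)}`;
  const metaHtml = `<meta name="doc-ref" content="${tag}">`; // never rendered to humans
  return { tag, metaHtml };
}

export function canaryEchoed(tag: string, submittedText: string): boolean {
  return submittedText.includes(tag);
}
```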

4. Behavioral baselines that LLMs don't match

  • Scroll patterns. Humans scroll-and-rest, scroll-and-rest. LLM agents scroll deterministically when the LLM decides to "see more content", with a very different rhythm.
  • Hover-without-click rates. Humans hover over items they're considering. Agents don't — the LLM's decision is internal and emerges as a click (a sketch of this ratio follows this list).
  • Multi-tab behavior. Humans open multiple tabs. Agents (currently) drive one window per session.
  • Idle time. Humans have idle time within a session (read, think, look away). Agents have inference-bound thinking time only.
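A minimal sketch of two of these baselines (hover-to-click ratio and idle share), assuming your client-side telemetry already reports the raw counts. The event definitions and cutoffs are illustrative:

```typescript
interface SessionEvents {
  hovers: number;   // pointerover events on interactive elements
  clicks: number;
  activeMs: number; // time with input or scroll activity
  totalMs: number;  // total session duration
}

export function behavioralBaselines(s: SessionEvents) {
  const hoverPerClick = s.clicks > 0 ? s.hovers / s.clicks : s.hovers;
  const idleShare = s.totalMs > 0 ? 1 - s.activeMs / s.totalMs : 0;

  return {
    hoverPerClick, // humans typically hover more than they click
    idleShare,     // humans read, think, look away; agents mostly don't
    agentLike: hoverPerClick < 0.5 && idleShare < 0.1, // illustrative cutoffs
  };
}
```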

The policy decision: block, allow, or attribute

Detection is the easy half. The hard half is deciding what to do once you've detected it. Three reasonable postures:

Block

Default-block on payment, signup, account creation, password reset, payout-method change, and any high-value action surface. The agent operators have not yet built consent or identity-binding flows that protect the underlying user; allowing agentic AI through these surfaces creates real fraud and consent exposure.

Allow with attribution

Allow on public-information surfaces (marketing pages, product info, help docs, blog content, status pages). Attribute via a specific log/header so analytics doesn't conflate agent visits with human conversions. Agent-driven traffic that finds your product organically and recommends it to a user is a positive — but only if you can measure it separately.

Build an explicit agent path

The endgame for any platform that's important enough to be a frequent agent destination. Provide an authenticated agent-API endpoint, with appropriate identity binding (the user delegates an agent's access via a token), rate limits, and a separate logging surface. The user's identity and consent flow through cleanly; the platform's fraud surface stays clean. Agentforce, OpenAI Apps SDK, and Anthropic's MCP ecosystem are converging on this shape.

Most teams will run all three concurrently — block by default at high-value surfaces, allow + attribute on content surfaces, and add the explicit agent path on the surfaces where agent traffic is high enough to be worth the engineering investment. Don't pretend you can pick one.
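A sketch of what running all three concurrently can look like at the routing layer. The surface names and the action set are placeholders for your own route map:

```typescript
type Surface = "payment" | "signup" | "account" | "content" | "docs" | "agent-api";
type Verdict = { aiAgentDetected: boolean; aiAgentProduct?: string };
type Action = "block" | "allow-attributed" | "require-agent-token" | "allow";

export function policyFor(surface: Surface, v: Verdict): Action {
  if (!v.aiAgentDetected) return "allow";

  switch (surface) {
    case "payment":
    case "signup":
    case "account":
      return "block";               // high-value surfaces: default-block agents
    case "agent-api":
      return "require-agent-token"; // explicit agent path with delegated identity
    default:
      return "allow-attributed";    // content surfaces: allow, tag in analytics and logs
  }
}
```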

What we ship at Sentinel

Sentinel's evaluate endpoint includes signals specific to agentic AI traffic in the device-intel response:

  • aiAgentDetected — boolean. True for any of the major commercial or open-source agents we identify.
  • aiAgentProduct — when known: chatgpt-atlas, openai-operator, claude-computer-use, browser-use, etc.
  • aiAgentConfidence — float 0–1. High when origin signal matches; medium when only behavioral signals fire.
  • botCategory — when applicable: scraper | automation | agentic-ai | headless-browser. Lets you route to different policies cleanly.

This sits alongside the existing automationDetected (which catches Puppeteer/Playwright/Selenium), antidetectBrowser (which catches Multilogin/Kameleo/etc.), and residentialProxy. The four together cover the modern bot taxonomy comprehensively.
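A minimal consumer of that response, with the endpoint URL and request payload as placeholders; only the field names come from the response described above:

```typescript
interface SentinelSignals {
  aiAgentDetected: boolean;
  aiAgentProduct?: string;   // e.g. "chatgpt-atlas", "openai-operator"
  aiAgentConfidence: number; // 0..1
  botCategory?: "scraper" | "automation" | "agentic-ai" | "headless-browser";
  automationDetected: boolean;
  antidetectBrowser: boolean;
  residentialProxy: boolean;
}

async function evaluateRequest(payload: unknown): Promise<SentinelSignals> {
  const res = await fetch("https://api.example-sentinel.com/v1/evaluate", { // placeholder URL
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return (await res.json()) as SentinelSignals;
}
```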

The free tier (1,000 requests/hour) is more than enough to instrument a public site or product surface for two weeks and measure exactly how much agentic-AI traffic you're getting and what it's doing. Most teams find 2–8% of their non-bot traffic is agentic AI as of mid-2026 — and the trend is up-and-to-the-right at a steep angle.

The next 12 months

A few well-supported predictions:

  • Agentic AI traffic will cross 15–25% of total non-search-engine traffic on consumer-facing sites by Q2 2027. Some verticals (travel, e-commerce, jobs) will hit that earlier.
  • Agent operators will start spoofing identity to evade unfriendly sites. The current well-behaved-UA pattern is a soft consensus that won't survive contact with sites that block it. Expect a Genesis-Market-style marketplace for "stealth agentic-AI runtimes" within 18 months.
  • The platforms that ship agent-aware paths early will own the agent-recommended-products distribution channel. The platforms that don't will be intermediated by agents using older flows that bypass their analytics and their commerce.
  • The legacy bot-detection vendors will be late. Their detection signatures are designed for "scripts following sequences"; agentic traffic doesn't trigger them and they have no behavioral signal to catch it. The detection axis has shifted and their roadmaps haven't.

Agentic AI is not a future fraud category. It's a current traffic category that legacy fraud APIs misclassify as human and ad analytics platforms misclassify as conversions. Detect it, decide your policy per surface, and ship the explicit agent path on the surfaces where the traffic is real. The teams that handle this in the next two quarters will own a real distribution edge for the next several years.