How We Achieved Sub-40ms Global Latency

Fraud prevention shouldn't cost you conversions. Every 100ms of added latency at checkout drops conversion rates by roughly 1%.

When building Sentinel, our core directive was speed. Legacy APIs like IPQS and MaxMind route all requests through central servers in the US or Europe, producing 200ms+ round trips for users in Asia or South America. For modern applications, that's unacceptable.

The Anycast Edge Architecture

We rewrote the rulebook by deploying our decision engine directly to the edge. Using BGP Anycast routing, user telemetry is evaluated at the topologically nearest of our 250+ global POPs. A user in Singapore never touches a US server.

Rust at the Core

Our core assessment engine is written entirely in Rust and uses less than 2MB of memory per instance. That footprint is small enough for the entire ML model to stay resident in L3 cache, so the hot path involves no disk I/O and evaluation rarely has to wait on main memory. The result: a global average response time of 38ms.

The Three Decisions That Bought 80% of Our Latency

1. Anycast over GeoDNS: DNS-based geo-routing has 30-90 minute propagation lag and frequently sends users to the wrong region. BGP Anycast routes to the topologically nearest POP at the IP layer, which means a Sydney user reaches our Sydney POP in one hop, not three.
2. No database in the hot path: every assessment that touched Postgres added 8-25ms. We rebuilt the decision engine to load the entire risk model into memory at process start, refresh it from a CRDT every 60 seconds, and never block on storage during evaluation.
3. HTTP/2 server push for the SDK token: by pushing the token-injection script bytes alongside the initial HTML, we eliminate the second round-trip on cold loads. First-paint risk score is available 18ms earlier on average.
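The "no database in the hot path" decision can be sketched as follows. This is a minimal illustration in Python, not the Rust engine itself: `RiskModel`, `ModelHolder`, `evaluate`, and the loader callable are hypothetical names, and the loader stands in for whatever fetches a fresh snapshot (the CRDT sync is not modeled here). The point it shows is that evaluation reads an in-memory snapshot while a background thread swaps in new ones.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskModel:
    """Immutable snapshot of the risk model (hypothetical shape)."""
    version: int
    blocked_asns: frozenset
    threshold: float = 0.5


class ModelHolder:
    """Keeps the current model in memory and swaps it on refresh."""

    def __init__(self, loader, refresh_seconds=60.0):
        self._loader = loader              # callable returning a new RiskModel
        self._refresh_seconds = refresh_seconds
        self._model = loader()             # load once at process start
        self._stop = threading.Event()

    def current(self):
        # A plain attribute read: evaluation never waits on storage.
        return self._model

    def refresh_once(self):
        # Build the new snapshot first, then replace the reference.
        self._model = self._loader()

    def start(self):
        def loop():
            # Event.wait returns False on timeout and True after stop().
            while not self._stop.wait(self._refresh_seconds):
                try:
                    self.refresh_once()
                except Exception:
                    pass  # keep serving the last good snapshot
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()


def evaluate(holder, asn, score):
    """Return 'block' or 'allow' using only in-memory data."""
    model = holder.current()
    if asn in model.blocked_asns or score >= model.threshold:
        return "block"
    return "allow"


if __name__ == "__main__":
    holder = ModelHolder(lambda: RiskModel(1, frozenset({64500})))
    holder.start()
    print(evaluate(holder, asn=64500, score=0.1))
    holder.stop()
```

Because each snapshot is immutable and the refresh only replaces a reference, a request that started with the old model finishes with the old model; there is no lock for the hot path to contend on.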

What 40ms Buys You at Checkout

Stripe published data showing every 100ms of added checkout latency drops conversion by ~1%. A retailer doing $10M/year through checkout loses $100k for every 100ms they add. Most fraud APIs add 200-400ms. Sentinel adds 40ms.

For a customer doing $50M/year, each 100ms costs $500k, so a 320ms gap (for example, 360ms versus Sentinel's 40ms) is a $1.6M difference per year compared to running IPQS or MaxMind synchronously in the checkout path.
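The arithmetic behind these figures is easy to reproduce. The sketch below assumes the linear rule quoted above (about 1% of checkout revenue lost per 100ms of added latency); the function name `annual_latency_cost` is ours, and real conversion curves are unlikely to be perfectly linear.

```python
def annual_latency_cost(annual_revenue, added_ms, loss_per_100ms=0.01):
    """Estimated yearly revenue lost to added checkout latency.

    Assumes a linear loss of `loss_per_100ms` of revenue per 100ms.
    """
    return annual_revenue * loss_per_100ms * (added_ms / 100)


if __name__ == "__main__":
    # $10M/year retailer, 100ms of added latency
    print(f"${annual_latency_cost(10_000_000, 100):,.0f}")       # $100,000

    # $50M/year customer: gap between a slower API and a 40ms one
    for other_ms in (200, 360, 400):
        gap = other_ms - 40
        print(other_ms, f"${annual_latency_cost(50_000_000, gap):,.0f}")
    # 200 $800,000
    # 360 $1,600,000
    # 400 $1,800,000
```

Under this rule, the 200-400ms range quoted for other fraud APIs corresponds to roughly $0.8M to $1.8M per year at $50M of checkout volume.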

The Tradeoff We Made

Sub-40ms means we can't do every check inline. Long-tail device-cluster lookups (the kind that catch the most sophisticated fraud) run asynchronously after the response. The first request from a new visitor returns a fast verdict based on network and immediate device signals; the cluster verdict follows within ~300ms via the same SDK channel. For 99.4% of requests the cluster verdict matches the fast inline verdict; for the remaining 0.6% we accept a brief inconsistency window in exchange for not blocking checkout.
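The fast-verdict-then-update pattern might look roughly like this. It is a sketch under stated assumptions: `fast_verdict`, `cluster_verdict`, `assess`, the signal keys, and the 0.8 threshold are all hypothetical, and a short `asyncio.sleep` stands in for the slower cluster lookup. It only illustrates the control flow: the caller gets a verdict immediately, and the second verdict arrives later through a callback.

```python
import asyncio


def fast_verdict(signals):
    """Inline check on network and immediate device signals (hypothetical rule)."""
    return "block" if signals.get("ip_risk", 0.0) >= 0.8 else "allow"


async def cluster_verdict(signals, delay=0.01):
    """Stand-in for the slower device-cluster lookup."""
    await asyncio.sleep(delay)  # simulates lookup latency
    return "block" if signals.get("cluster_flagged", False) else "allow"


async def assess(signals, on_update):
    """Return the fast verdict now; deliver the cluster verdict later."""
    verdict = fast_verdict(signals)

    async def follow_up():
        on_update(await cluster_verdict(signals))

    # Scheduled on the event loop; it does not run before we return.
    task = asyncio.create_task(follow_up())
    return verdict, task


async def demo(signals):
    updates = []
    verdict, task = await assess(signals, updates.append)
    first_seen = (verdict, list(updates))  # state when the response is sent
    await task                             # the update lands afterwards
    return first_seen, updates


if __name__ == "__main__":
    print(asyncio.run(demo({"ip_risk": 0.1, "cluster_flagged": True})))
```

In the example input, the fast path says "allow" and the later cluster lookup says "block", which is the kind of disagreement the 0.6% inconsistency window refers to.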

How to Test It Yourself

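Run curl -w "%{time_total}\n" -o /dev/null -s -X POST "https://sntlhq.com/v1/evaluate/sample?scenario=clean" from any region. The URL is quoted so the shell does not treat the ? as a glob character, and curl reports time_total in seconds, so 38ms prints as roughly 0.038. Median worldwide: 38ms. p99 from any continent: under 95ms. The sample endpoint requires no auth and uses the same edge path as the production API.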
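A single curl run is one sample; to compare against a median or p99 you need many. The script below is one way to collect and summarize them, assuming the sample endpoint accepts an empty POST body as the curl command implies. The helper names and the nearest-rank percentile method are our choices, not part of the Sentinel API. Each request opens a fresh connection, so the timings include DNS, TCP, and TLS setup, much like curl's time_total.

```python
import math
import statistics
import time
import urllib.request

SAMPLE_URL = "https://sntlhq.com/v1/evaluate/sample?scenario=clean"


def time_request_ms(url=SAMPLE_URL, timeout=5.0):
    """Time one POST round trip in milliseconds."""
    request = urllib.request.Request(url, data=b"", method="POST")
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()
    return (time.perf_counter() - start) * 1000.0


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_ms):
    """Median and p99 of the collected timings."""
    return {
        "median": statistics.median(samples_ms),
        "p99": percentile(samples_ms, 99),
    }


if __name__ == "__main__":
    timings = [time_request_ms() for _ in range(100)]
    print(summarize(timings))
```

With only 100 samples the p99 is effectively the second-slowest request, so run more iterations if you want a stable tail figure.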
