When Serverless Stops Helping: Unkey's 6x Performance Win
A developer platform rebuilt its API authentication service from Cloudflare Workers to stateful Go servers, cutting latency by 6x. Here's what they learned about serverless limits.
When your API sits in the request path for thousands of applications, every millisecond compounds. Unkey, an API authentication platform, just learned this the hard way—and their solution was to abandon serverless entirely.
The company rebuilt its entire authentication service from scratch, moving from Cloudflare Workers to stateful Go servers. The result: a sixfold reduction in latency and the elimination of what co-founder Andreas Thomas called a "complexity tax" that had consumed their engineering bandwidth.
This isn't just another migration story. It's a data point in a growing conversation about when serverless computing stops being an accelerant and becomes a constraint.
The Physics Problem
Unkey's breaking point was caching. Their goal was simple: respond to authentication requests in under 10 milliseconds total. But Cloudflare's cache was taking more than 30 milliseconds at the 99th percentile—before they'd even started processing the request.
The root cause was architectural. Serverless functions are stateless by design. They spin up, handle a request, and disappear. Any cached data must live elsewhere and be retrieved over the network. According to Thomas, they tried everything: multi-tier caching systems using various Cloudflare services, optimized cache keys, tuned expiry times.
None of it worked. As Thomas put it: "Zero network requests are always faster than one network request. No amount of stacking external caches could get us around this fundamental limitation."
A stateful server solves this by default. Hot data sits in memory. There's no network hop, no cache miss penalty, no clever workarounds needed.
The Complexity Tax
The caching problem was just the beginning. Unkey ran into a second issue that should be familiar to anyone who's scaled serverless: event handling.
In a traditional server, you batch events in memory and flush them every few seconds. But serverless functions might vanish the moment they finish processing, so you have to flush on every invocation. For Unkey, this meant building chproxy, a custom Go proxy designed to buffer analytics events before sending them to ClickHouse, which performs poorly with thousands of small inserts.
They also built a separate pipeline for metrics and logs, routing them through intermediate Cloudflare Workers that parsed and split events before forwarding them on. The architecture sprawled: Durable Objects, Logstreams, Queues, Workflows, and several custom stateful services.
According to the InfoQ coverage, Thomas said the team "spent more time evaluating and integrating new SaaS products than building features." They were solving problems the serverless architecture had created, not problems their customers had.
What They Built Instead
The new system runs on AWS Fargate with Global Accelerator in front. It still distributes traffic globally, but now long-lived Go processes hold data in memory and naturally batch events. All the auxiliary services supporting the serverless architecture disappeared. The code got simpler. The bill went down.
The migration also unlocked features that were nearly impossible before. Self-hosting, for example. The Cloudflare Workers runtime is open source in theory, but getting it to run locally is non-trivial. With a monolithic Go application, customers can spin up the product with a single Docker command. Engineers can run the whole stack on their laptops in seconds without working around Cloudflare-specific APIs.
Unkey is planning to launch a deployment platform next year that will let customers run the service wherever they want, adding portability and flexibility that their serverless architecture couldn't support.
The Pattern Emerges
Unkey isn't alone. Amazon Prime Video made headlines with a similar migration, consolidating its video quality monitoring from a distributed serverless system into a single process and reducing infrastructure costs by more than 90 percent. That case study caused a stir because it came from Amazon itself—a company that helped popularize serverless through AWS Lambda.
The pattern is consistent: high-volume workloads with tight coupling between components hit a wall with serverless. The pay-per-invocation model stops making economic sense. The stateless architecture overcomplicates simple operations.
Serverless consultant Yan Cui responded to Unkey's announcement on LinkedIn by asking when serverless stops helping, noting that an architecture suited to early growth can become a constraint later. Engineer Luca Maraschi was more blunt, suggesting teams examine their traffic patterns closely rather than assuming edge or serverless functions are the best fit for every workload.
What This Means for You
Serverless remains valuable for the right workloads. Event-driven systems, intermittent traffic, services that don't need persistent state—these are still good fits. The cost savings and operational simplicity can be significant.
But Unkey and Prime Video show the limits. If you're building a high-throughput service with strict latency requirements, serverless might be working against you. The warning signs, going by Unkey's experience:

- Tail latency is dominated by network hops to external caches or state stores rather than by your own processing.
- You're building custom infrastructure—proxies, queues, intermediate workers—just to work around statelessness.
- Your team spends more time evaluating and integrating services than shipping features.
The serverless-by-default mindset made sense when the alternative was managing servers. But with modern deployment platforms like Fargate, Railway, or Fly.io, running stateful applications isn't the operational burden it used to be.
Unkey's story is a reminder that architecture decisions have trade-offs. Serverless bought them rapid deployment and global distribution early on. But as their requirements evolved—sub-10ms latency, in-memory caching, efficient event batching—the architecture that got them started became the architecture holding them back.
Sometimes the best path forward is a step back to fundamentals: keep your data close, minimize network hops, and don't solve problems you don't have. A 6x performance improvement suggests they made the right call.