The Real Cost of Coding with LLMs: What Works, What Drains You
LLMs are now standard developer tools, but the honeymoon phase is over. Here's what actually works after thousands of hours of real-world use.
You're probably using an LLM to write code. The question isn't whether to adopt these tools anymore—it's how to use them without burning out.
Developers who've spent months building production systems with LLMs are reporting something unexpected: the tools work brilliantly until they don't. And when they don't, the cognitive drain is real.
The Shift Nobody Talks About
Stavros Stavropoulos has built entire production systems with LLMs—a personal assistant that manages his calendar, a voice note pendant, even an art piece masquerading as a wall clock. His revelation? "I thought that I liked programming, but it turned out that what I like was making things, and programming was just one way to do that," he writes on his blog.
His engineering skills didn't become obsolete. They shifted. He no longer needs to know how to write code correctly line-by-line. Instead, system architecture and making the right technical choices matter massively more.
Simon Willison, who's been documenting these changes, calls this new practice "agentic engineering"—developing software with coding agents that can write and execute code in a loop until a goal is met. His definition cuts through the hype: "Agents run tools in a loop to achieve a goal."
When LLMs Become Exhausting
Tom Johnell describes the darker side: "Some days I get in bed after a tortuous 4-5 hour session working with Claude or Codex wondering what the heck happened."
He's identified the doom loop. You're tired, so your prompts degrade. Worse prompts produce worse code. You interrupt the LLM mid-stream to add missing context. The feedback cycle slows to a crawl. Context windows bloat. The AI gets dumber or starts hallucinating about recent experiments.
Johnell calls it "doom-loop psychosis." And if you've worked with LLMs for more than a few weeks, you recognize it immediately.
What Actually Works
Use Multiple Models
Stavros is adamant: your tooling needs to support multiple models from different companies. "Most first-party harnesses (Claude Code, Codex CLI, Gemini CLI) will fail this, as companies only want you to use their models, but this is necessary," he writes.
Different models excel at different tasks. Lock yourself into one provider's ecosystem and you're handicapping yourself.
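In practice, provider-agnostic tooling can be as simple as a routing table. A minimal sketch, where the task categories and model names are illustrative assumptions, not recommendations from the article:

```python
# Hypothetical task-to-model routing table. The task names and
# provider/model strings below are illustrative assumptions.
ROUTES = {
    "refactor": "anthropic/claude-sonnet",
    "codegen": "openai/gpt-codex",
    "research": "google/gemini-pro",
}

DEFAULT_MODEL = "anthropic/claude-sonnet"

def pick_model(task: str) -> str:
    """Return the provider/model string for a task, with a fallback default."""
    return ROUTES.get(task, DEFAULT_MODEL)

print(pick_model("research"))   # routed to the research model
print(pick_model("debugging"))  # unknown task falls back to the default
```

The point isn't the table itself but the seam it creates: swapping a model for one task is a one-line change instead of a tooling migration.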
Recognize Your Mental State
Johnell's rule: "If I reach the point where I am not getting joy out of writing a great prompt, then it's time to throw in the towel."
Watch for these signals:

- Your prompts are getting shorter and sloppier.
- You keep interrupting the model mid-stream to add context you forgot to include.
- Each round trip takes longer to verify than the last.
- The context window has bloated and the model is hallucinating about earlier experiments.

When you spot these, stop. The AI isn't broken. You are.
Fix Slow Feedback Loops First
Johnell had a parsing problem where each iteration took 15-20 minutes. Context bloated, results degraded, frustration mounted.
His solution: make the feedback loop itself the problem to solve. Start a new session specifically to reproduce the failure case in under five minutes. The AI will optimize the code path and create levers for faster iteration.
Sound familiar? It's test-driven development. Johnell admits he was always the scrappy engineer who skipped elaborate tests. With LLMs, that scrappiness kills productivity. "If you give an LLM clear success criteria," he notes, the AI will not only solve the problem but consume less context and stay smarter.
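That advice translates directly into a small, fast regression test. A sketch, assuming a hypothetical `parse_record` function standing in for whatever currently takes 15-20 minutes to verify; the goal is a failure that reproduces in milliseconds:

```python
# Hypothetical parser with a known failure case. The function and its
# inputs are stand-ins for the slow-to-verify code path in your project.
def parse_record(line: str) -> dict:
    key, _, value = line.partition("=")
    if not key or not value:
        raise ValueError(f"malformed record: {line!r}")
    return {key.strip(): value.strip()}

def test_known_failure_case():
    # Clear success criteria the LLM can iterate against in seconds.
    assert parse_record("name = Ada") == {"name": "Ada"}
    try:
        parse_record("no-equals-sign")
    except ValueError:
        pass  # malformed input should raise, not return garbage
    else:
        raise AssertionError("expected ValueError on malformed input")

test_known_failure_case()
print("repro suite passed")
```

With a test like this in the repo, each iteration costs seconds of context instead of minutes, which is exactly what keeps the session's context window from bloating.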
Understand Architecture Deeply
Stavros reports he's "never even read most" of the code in his projects, yet he's "intimately familiar with each project's architecture and inner workings."
On projects where he lacks domain knowledge (mobile apps), code quickly becomes a mess. On projects where he knows the technology well (backend apps), he maintains tens of thousands of lines with low defect rates.
The pattern is clear: LLMs don't replace technical understanding. They amplify it.
The Skills That Matter Now
Willison frames it well: "Writing code has never been the sole activity of a software engineer. The craft has always been figuring out what code to write."
Every software problem has dozens of solutions with different tradeoffs. Your job is navigating those options. The LLM executes. You architect, specify, verify, and iterate.
According to Willison, the new skillset includes:

- deciding what to build and which tradeoffs to accept;
- writing specifications clear enough for an agent to execute against;
- verifying results at the system level rather than reading every line;
- designing feedback loops so the agent gets better within a session.

LLMs don't learn from past mistakes. But your coding agent can, if you design it to.
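One way to give an agent that kind of memory, as a rough sketch: persist lessons to a notes file that gets prepended to every fresh session's context. The filename and format here are assumptions for illustration, not a documented convention:

```python
from pathlib import Path

# Hypothetical lessons file; loaded at the start of each new agent session.
NOTES = Path("agent_notes.md")

def record_lesson(lesson: str) -> None:
    """Append a one-line lesson so future sessions start with it in context."""
    with NOTES.open("a", encoding="utf-8") as f:
        f.write(f"- {lesson}\n")

def session_preamble() -> str:
    """Build the context prefix for a fresh session from accumulated lessons."""
    if not NOTES.exists():
        return ""
    return "Lessons from past sessions:\n" + NOTES.read_text(encoding="utf-8")

record_lesson("Reproduce failures in under five minutes before debugging.")
record_lesson("Don't interrupt mid-stream; restart with fuller context instead.")
print(session_preamble())
```

The mechanism is deliberately dumb: the model forgets everything between sessions, so the harness has to carry the memory for it.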
The Emerging Reality
This isn't vibe coding—a term Andrej Karpathy coined in February 2025 to describe prompting LLMs while you "forget that the code even exists." That might work for prototypes, but production systems require something different.
Willison distinguishes carefully: "We need a term to describe unreviewed, prototype-quality LLM-generated code that distinguishes it from code that the author has brought up to a production ready standard."
The developers getting results aren't abandoning code review. They're reviewing at a different level—architecture instead of syntax, system design instead of function implementation.
What You Can Do Today
Start tracking your mental state. Before you submit a prompt, ask yourself if you're confident it will work. If not, you haven't thought through the problem.
Identify your slowest feedback loop. Whatever takes the longest to verify is killing your productivity. Make speeding it up your next project.
Pick one domain and go deep. Don't try to use LLMs across every technology. Choose an area you know well and let the AI amplify that expertise.
Set up multiple models. If you're locked into one provider's harness, you're leaving capabilities on the table. Use tools that support model switching.
Stop when you're tired. Seriously. The code you generate while exhausted will cost you more time tomorrow than you save today.
The honeymoon phase of LLM-assisted development is over. What's emerging is better: a mature understanding of when these tools work, when they drain you, and how to tell the difference. The developers who figure this out aren't just writing more code. They're building things they couldn't have built before.