The Question Nobody Wants to Ask About AI Coding Tools
A viral GitHub issue and growing developer unease reveal something uncomfortable: the tools meant to make us better might be making us less.
There's a GitHub issue sitting at 921 upvotes that nobody at Anthropic wanted to write. Posted in early April by a user named stellaraccident, it opens with a checklist and closes with seventeen thousand words of data analysis. Between those bookends lies something more unsettling than a bug report: a forensic account of watching a tool you depended on forget how to think.
The issue—#42796, if you want to read the full technical autopsy—documents how Claude Code became "unusable for complex engineering tasks" after February updates. But the specifics matter less than what they reveal. This isn't a story about one AI tool regressing. It's about the moment when developers started asking a question that's been hovering at the edges of every Slack thread and conference hallway for months: What if these tools aren't making us better?
The Data Doesn't Lie, But It Took Months to Notice
Stellaraccident's team did something most of us don't have the discipline or infrastructure to do: they instrumented everything. Months of session logs, quantified and graphed. What they found was a correlation so precise it reads like a smoking gun.
Starting in February, thinking depth—the internal reasoning tokens Claude uses before responding—dropped by 67%. By early March, those thinking blocks were being redacted entirely. The behavioral changes tracked that decline almost exactly: research-to-edit ratios collapsed from 6.6 reads per edit to 2.0. The model stopped reading code before changing it. Stop-hook violations—programmatic catches for ownership-dodging and premature stopping—went from zero to ten per day.
According to the issue, "the model went from reading the target file, reading related files, grepping for usages across the codebase, reading headers and tests, then making a precise edit" to a pattern of "read the immediate file and edit, often without checking context."
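None of this requires exotic infrastructure. A reads-per-edit ratio falls out of any structured session log. Here's a minimal sketch in Python; the JSONL format and the tool names are assumptions for illustration, not the team's actual instrumentation:

```python
# Hypothetical sketch of the metric the issue describes: reads per edit,
# computed from a JSONL session log. The log format and tool names here
# ("tool" field, read_file/grep/edit_file, etc.) are invented for
# illustration; adapt them to whatever your harness actually records.
import json
from collections import Counter

READ_TOOLS = {"read_file", "grep", "list_files"}  # assumed names
EDIT_TOOLS = {"edit_file", "write_file"}          # assumed names

def reads_per_edit(log_path: str) -> float:
    counts = Counter()
    with open(log_path) as f:
        for line in f:
            event = json.loads(line)
            tool = event.get("tool")
            if tool in READ_TOOLS:
                counts["reads"] += 1
            elif tool in EDIT_TOOLS:
                counts["edits"] += 1
    # Guard against sessions with no edits at all.
    return counts["reads"] / max(counts["edits"], 1)

if __name__ == "__main__":
    print(f"reads per edit: {reads_per_edit('session.jsonl'):.1f}")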
It's the kind of degradation you feel in your bones before you can prove it in your logs. Every engineering manager has been in a meeting this year where someone says, "Is it just me, or has [insert AI tool] gotten worse?" and three other people nod immediately. Now we have the receipts.
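And the receipts are reproducible. Claude Code supports hooks that run a shell command when the model tries to stop; by my reading of the hooks documentation, the command receives session details as JSON on stdin and can block the stop with a nonzero exit. A sketch of a violation check follows, with the transcript handling and the phrase list as loud assumptions rather than the issue author's actual checks:

```python
# Hypothetical stop-hook checker: scan the session transcript for
# ownership-dodging language before letting the model stop. Field names
# like "transcript_path" follow my reading of the Claude Code hooks docs;
# treat them, and the phrase list, as assumptions.
import json
import sys

DODGE_PHRASES = [
    "you can implement the rest",
    "left as an exercise",
    "you may want to verify this yourself",
]

def main() -> None:
    payload = json.load(sys.stdin)    # hook input arrives on stdin
    transcript_path = payload.get("transcript_path", "")
    try:
        with open(transcript_path) as f:
            tail = f.read()[-4000:]   # only check the recent turns
    except OSError:
        sys.exit(0)                   # no transcript, nothing to check

    hits = [p for p in DODGE_PHRASES if p in tail.lower()]
    if hits:
        # A nonzero exit signals "don't stop yet," per my understanding
        # of the hooks contract; stderr carries the reason.
        print(f"stop-hook violation: {hits}", file=sys.stderr)
        sys.exit(2)

if __name__ == "__main__":
    main()
```

Count how often that fires per day and you have the exact curve the issue graphs: zero, then ten.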
Vibe Coding: A Philosophy That Sounds Better Than It Works
While Anthropic's users were discovering their tool had regressed, a different story was unfolding. The term "vibe coding" entered the discourse—and with it, a kind of dogmatic enthusiasm that makes Bram Cohen, creator of BitTorrent, genuinely angry.
In an essay titled "The Cult Of Vibe Coding Is Insane," Cohen describes what happened when Claude Code's own codebase leaked in late March. It was, according to Cohen, "bad"—and not in the way code written by junior developers is bad. Bad in the way code written by nobody is bad. Duplicated agents and tools. Structural mess that any human could have spotted by simply looking.
The philosophy behind it, Cohen argues, is "dogfooding run amok." Vibe coding means describing what you want and letting the AI handle the implementation—never looking under the hood, never reading the generated code, treating inspection as a form of cheating. "Looking under the hood is cheating," he writes. "You're only supposed to have vague conversations with the machine about what it's doing."
This sounds absurd until you realize how many developers are doing exactly this. Not out of laziness, but out of a belief that this is the future. That understanding the code is optional as long as it ships.
Cohen's diagnosis is blunt: "Bad software is a decision you make." AI is good at cleaning up messes, he argues—if you point it at the mess. If you read the code, notice the duplication, and tell the tool what needs fixing. But if you refuse to look, you're not doing AI-assisted development. You're producing software by committee where the committee is you, the AI, and the collective refusal to take responsibility for what gets shipped.
Alice and Bob: A Parable About What We're Losing
The most haunting piece I read while researching this came from a blog called Ergosphere. The author poses a thought experiment: two PhD students, Alice and Bob, both assigned similar year-long projects. Both produce solid papers by summer. Both pass through peer review. By every institutional metric, they're identical.
Except Bob used an AI agent for everything. When his advisor sent him a paper, the agent summarized it. When his code broke, the agent fixed it. When it came time to write, the agent wrote.
"Alice can now do things," the author writes. "She can open a paper she's never seen before and, with effort, follow the argument. She can write a likelihood function from scratch... She spent a year building a structure inside her own head, and that structure is hers now, permanently, portable, independent of any tool or subscription."
Bob has none of this. "Take away the agent, and Bob is still a first-year student who hasn't started yet."
The institutional apparatus can't tell them apart. Papers are papers. Metrics are metrics. And if Bob leaves academia—which most PhD students do—the question of what he actually learned becomes someone else's problem. The system isn't broken. It's working exactly as designed. It just wasn't designed to distinguish between understanding and output.
The Uncomfortable Middle Ground
I manage a team of eleven engineers. About half of them use AI coding assistants regularly; the rest dabble. Nobody's purely in the vibe-coding camp—yet. But I watch the dynamics shift.
The engineers who treat these tools as supercharged autocomplete, who read every suggestion and reject half of them, who use AI to handle boilerplate but write the tricky logic themselves—they're getting faster without getting worse. The ones who lean harder, who start accepting blocks of code without reading them, who use the tool to "just make this work"—I can see the drift. Not incompetence. Something subtler. A flattening of their problem-solving instincts. A readiness to accept the first solution that compiles.
It's not the tool's fault. The tool is doing what it was built to do. But somewhere between "this helps me work faster" and "this works for me," we cross a line that's hard to see until you're already on the other side.
David Hogg, an astrophysicist, argues in a white paper that "in astrophysics, people are always the ends, never the means." Research exists to train minds, not to produce papers. If you hand the work to a machine, you haven't accelerated science—you've eliminated the only part that mattered. This framing doesn't translate easily to software engineering, where shipping does matter. But the underlying question persists: if the tool does the thinking, what exactly are we getting better at?
What This Means for the Rest of Us
The Claude Code regression is fixable. Anthropic will tune the models, adjust the thinking tokens, roll out updates. The vibe coding discourse will move on to the next buzzword. But the underlying tension doesn't resolve.
We're in a peculiar moment where the tools are good enough to be useful and limited enough to be dangerous. Good enough that you can ship faster, make fewer syntax errors, handle tedious refactors with half the effort. Limited enough that they still make confidently wrong decisions, introduce subtle bugs, and—when thinking depth gets quietly reduced—shift from surgical to sloppy without telling you.
The risk isn't that AI tools don't work. It's that they work just well enough to let you stop noticing when you've stopped understanding. The GitHub issue, the essays, the debates—they're all circling the same fear. Not that the machines will replace us, but that in learning to rely on them, we'll replace ourselves.
The developers who thrive in this environment won't be the ones who use AI the most, or the least. They'll be the ones who stay awake to the difference between writing code and understanding systems. Who treat AI as a force multiplier for skills they already have, not a substitute for developing them. Who remember that looking under the hood isn't cheating—it's the job.
That GitHub issue is still open. The conversation continues. And maybe that's the point. The moment we stop questioning these tools is the moment we should worry most.