Your AI Coding Tool Has Amnesia

I watched one of our engineers explain the same authentication pattern to Claude Code for the fourth time last month. Not because he forgot he’d explained it. Because the tool forgot.

Every session, from scratch. “We use JWT validation at the gateway layer, not in individual services.” He’d said it three days ago. And the week before. And every time he started a new session for the past six months. Each time, the AI nodded along, followed the instructions perfectly, and then forgot everything the moment the session ended.

I kept thinking about this, because it felt like the kind of problem that should already be solved. It’s 2026. These models are genuinely capable. They can reason about complex codebases, debug subtle race conditions, write solid tests. And yet they operate with what I can only describe as aggressive amnesia — a pathological inability to retain anything past the current session.

The autocomplete excuse

This made sense when AI coding tools were autocomplete engines. Copilot circa 2022 was completing single lines of code. The context was one file. Why would it need memory? You type, it suggests, you tab. Session memory is irrelevant.

But that’s not what these tools do anymore. We ask them to build features across multiple files. Debug production issues that require understanding system architecture. Onboard new engineers to unfamiliar codebases. Run autonomously on GitHub issues. And every single time, they start from zero.

I keep coming back to this analogy: imagine hiring a brilliant contractor who shows up every morning with total amnesia. They can code. They’re fast. But every day you spend the first hour explaining the project, the team conventions, the decisions you’ve already made, the mistakes you’ve already learned from. And the next morning? Same thing.

That’s the experience right now. For every team. With every tool.

The things that never make it into code

Here’s what bugs me most. The stuff the AI keeps forgetting isn’t in the code. It’s the stuff that lives between the lines:

Why we chose Postgres over DynamoDB. (Performance for our query patterns, but also because the team has deep Postgres expertise and zero DynamoDB experience.)

Why the notification service is a monolith module and not a microservice. (We tried microservices. It was a disaster. We reverted in Q3 and nobody documented why.)

That the billing pipeline has a known edge case where events get silently dropped under high load. (Two engineers know about this. One of them just gave notice.)

None of this is in a file the AI can scan. Some of it was in a Slack thread from eight months ago. Most of it is in people’s heads. And it’s exactly the kind of context that determines whether the AI’s output is correct or subtly, dangerously wrong.

“Just put it in a config file”

I know what you’re thinking, because I thought it too. CLAUDE.md. .cursorrules. System prompts. Just write it all down in a file and point the AI at it.

We tried. Everyone tries. And it works — for about three weeks, until the file is stale and nobody updates it because updating a config file is maintenance work that doesn’t ship features. The person who wrote the original file has moved on to other things. New decisions get made in conversations that never make it to the file. The file becomes a historical artifact that roughly corresponds to what the team believed at some point in the past.

It’s the wiki problem all over again. Someone creates it with good intentions. It starts decaying immediately. Within six months, developers actively distrust it because they’ve been burned by following outdated information.

“Just use a bigger context window”

The other popular answer. 200K tokens wasn’t enough, so now we have 1M. Just stuff everything in.

I’ve spent a lot of time thinking about this, and I think it fundamentally misunderstands the problem. A bigger context window gives you more room for the current session. It doesn’t give you memory. It doesn’t tell you why the team made a particular architectural decision last quarter. It doesn’t know about the production incident that shaped how the team thinks about error handling. It doesn’t know that Sarah is the only person who understands the reconciliation pipeline.

A bigger window is a bigger scratch pad. The scratch pad still gets erased when the session ends. You haven’t solved amnesia — you’ve given the amnesiac a larger notebook that also gets burned every night.

“Just add more agents”

This one is more recent and it’s the one that gets me. The answer to “the AI doesn’t know enough” is apparently “add more AIs that also don’t know enough, but give each one a narrower job.”

A review agent. A testing agent. A deployment agent. Fifteen specialized agents, each with hardcoded instructions for one task. Someone on the team wrote those instructions. Someone has to maintain them. When the review standards change, someone updates the review agent. When the testing framework changes, someone updates the test agent. It’s the config file problem at a higher level of abstraction, with more moving parts.

And you know what none of those agents know? Anything about your organization. They know what someone hardcoded into their instructions. They don’t know what the team learned last week.

The question that keeps nagging me

Here’s what I keep coming back to: what if the AI just… remembered?

Not the raw conversation transcript. That’s noise. But the actual knowledge — the decisions, the patterns, the mistakes, the conventions — extracted from conversations and available in future sessions. Not just for the engineer who had the conversation, but for the whole team.

An engineer explains why we use event sourcing for the audit system. That explanation becomes a structured knowledge item — available to every other engineer, in every future session, without anyone maintaining a file.

Someone discovers a subtle coupling between two services while debugging. That discovery gets captured. Next time someone touches either service, the AI already knows about the coupling. Not because someone remembered to document it, but because the system was listening when the knowledge was created.
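To make the idea concrete, here is a minimal sketch of what a captured knowledge item and store might look like. Everything in it is hypothetical — the names, fields, and lookup-by-path design are illustrative assumptions, not any shipping tool's API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class KnowledgeItem:
    """One piece of team knowledge extracted from a conversation."""
    kind: str                 # "decision", "pattern", "mistake", or "convention"
    summary: str              # the knowledge itself, in one or two sentences
    related_paths: list[str]  # files or services this knowledge applies to
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class KnowledgeStore:
    """Minimal in-memory store: capture items, surface them by file path."""

    def __init__(self) -> None:
        self._items: list[KnowledgeItem] = []

    def capture(self, item: KnowledgeItem) -> None:
        self._items.append(item)

    def relevant_to(self, path: str) -> list[KnowledgeItem]:
        """Everything the team has learned about this path, in any session."""
        return [i for i in self._items if path in i.related_paths]


store = KnowledgeStore()
store.capture(KnowledgeItem(
    kind="mistake",
    summary="Billing pipeline silently drops events under high load.",
    related_paths=["services/billing/pipeline.py"],
))

# A later session that touches the billing pipeline surfaces the
# warning automatically, without anyone updating a config file.
known = store.relevant_to("services/billing/pipeline.py")
```

The interesting design question is the retrieval key: an in-memory list keyed on file paths is the simplest possible version, and a real system would need durable storage and fuzzier matching, but the shape of the idea — structured items with a kind, a summary, and provenance — is the part that survives.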

The AI that helped you debug a billing issue on Tuesday starts your Thursday session already knowing what you discovered. The new engineer who joins next month has an AI that knows everything the team has learned in the past year — from day one.

I think about this a lot because it changes what the tool fundamentally is. It stops being a coding assistant and starts being organizational memory. Not a wiki that someone has to maintain. A living knowledge base that grows because people use the tool.

Where this goes

The AI coding tool market is about to split. On one side: tools that help individual engineers write code faster. These are commoditizing. The models get cheaper every quarter. The wrappers get thinner. There’s no durable advantage.

On the other side: tools that make an organization’s collective intelligence available to every engineer, every session, permanently. These don’t exist yet. Not really. Not in a way that actually works.

I’ve spent the last year thinking about what the second category looks like. How you build it. What the architecture needs to be. Where the industry’s assumptions are wrong.

All of it started from this one observation: your AI tool has amnesia, and nobody seems to think that’s a problem worth solving.

I think it’s the only problem worth solving.

The post Your AI Coding Tool Has Amnesia appeared first on SD Times.
