Claude, please stop trying to memorize random crap
129 points - today at 3:32 PM
> I believed this so strongly that my company built an entire product around this concept. I used to tell folks that "session transcripts were the new oil," that they were more valuable than the code itself.
> […]
> We don't really write code by hand anymore.
Honestly, isn't this just influencer spam? What possible value is there in reading about people who used to have products, but no longer write their own code, complaining about the inscrutable prediction machine they have handed that job and their livelihoods to?
Like, if you have complaints about the thing, perhaps you should address them to your supplier directly. None of your readers can help, and nobody's magic folk solution to your problem is better than yours.
And there are so many of these sorts of posts. Are we not entirely cooked?
(I think I have concluded that if people writing about AI aren't writing about interesting things they have achieved with small, local LLMs - which for clarity I am fully interested in reading - then I'm done reading. This whole blogging-about-cloud-AI genre is just weird and irresponsible now.)
It'll assume I own a datacenter and have lots of GPUs just because I asked it to research things.
My guess is that has something to do with the training process leaving models unable to differentiate between “what’s happening now” and “what happened before”. Perhaps if making inferences from memories was actually part of the training process things would be different but my sense is that as an inference-time-only feature this just gets the models confused.
It's certainly true at the moment, but give it 10 years and we might have systems that are much cheaper and much better at context management than they are now.
(Apologies to anyone who is under the impression that we were very likely going to be at the singularity in 10 years' time. Possible != very likely)
Now, I'll agree that this is probably the sort of thing I should put in the CLAUDE.md, but in this case it wasn't on my radar to put that in my CLAUDE.md, so it was nice that it surfaced that.
It does sometimes go awry, though. Today I was asking about a problem I was having authenticating, and it said "you may be running into this trusted proxy setting because you put your apps behind an haproxy". That is true of 95% of our apps, so it was worth mentioning, but in this case it was not, so I had to correct it. Still, I'm glad it mentioned it, because if we did have it proxied it could have saved me a lot of time.
It is like an annoying friend who remembers something from a past conversation that you have since grown past, but still wants to hold it against you.
Session logs can absolutely be useful, but not when building further. It's just that the place they slot in is during validation. You know, that place between the markdown plan and CI passing, where there are 800 new lines of code and it all seems sort of fine when you click around?
Session logs can show you what sort of manual validation happened. CI will run the tests you had, and the code will show you what new unit tests were added, but session logs can show you that the agent drove the app with Playwright, or that the agent read and considered the prod config as well as the dev config.
Nothing bulletproof, but not every piece of validation work merits a test in the repo that lives forever. We've gotten a lot of mileage out of re-analyzing the sessions, figuring out where the agent made decisions without asking, and forcing the agent to consider validation for those decisions. That's the sort of thing that's hard to dictate up front but easy to highlight with the session logs.
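For what it's worth, here is a minimal sketch of that kind of re-analysis. Everything about the log format is an assumption: the JSONL event schema, the field names (`type`, `tool`, `target`, `path`, `asked_user`), and the tool names are hypothetical, made up for illustration, and not any real agent's session format. It just splits a session into "manual validation that happened" and "decisions made without asking".

```python
import json

# Hypothetical event schema (an assumption, not any real tool's log format):
#   {"type": "tool_call", "tool": "playwright", "target": "/login"}
#   {"type": "file_read", "path": "config/prod.yaml"}
#   {"type": "decision", "summary": "...", "asked_user": false}
VALIDATION_TOOLS = {"playwright", "curl"}  # illustrative names only


def review_session(jsonl_text):
    """Split a session log into manual validation steps and unasked decisions."""
    validation, unasked = [], []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        kind = event.get("type")
        if kind == "tool_call" and event.get("tool") in VALIDATION_TOOLS:
            # The agent exercised the app with a validation tool.
            validation.append(f"{event['tool']}: {event.get('target', '')}")
        elif kind == "file_read" and "config" in event.get("path", ""):
            # The agent read a config file, e.g. prod as well as dev.
            validation.append(f"read {event['path']}")
        elif kind == "decision" and not event.get("asked_user", False):
            # A decision the agent made without checking in first.
            unasked.append(event.get("summary", ""))
    return {"validation": validation, "unasked_decisions": unasked}


SAMPLE_LOG = """
{"type": "tool_call", "tool": "playwright", "target": "/login"}
{"type": "file_read", "path": "config/prod.yaml"}
{"type": "file_read", "path": "config/dev.yaml"}
{"type": "decision", "summary": "switched session store to Redis", "asked_user": false}
{"type": "decision", "summary": "renamed env var", "asked_user": true}
"""

report = review_session(SAMPLE_LOG)
print(report)
```

The `unasked_decisions` list is the part you would hand back to the agent with "consider how to validate each of these". Real session logs will need their own parsing.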
“compare these three cars. Oh btw I am a data engineer, and my mom's maiden name is Joana, and I am allergic to bad poetry. And code should be DRY, I prefer SQL over Python and what's the most poisonous flower in Scandinavia?”.
I've had so much weird output because context is “““memorized””” and bleeding into completely unrelated projects and conversations. It's the first feature I turn off.
I agree with other commenters here: if anything is worth being remembered, it will be in code comments, git commit messages, CLAUDE.md or other formal documentation. The auto memory system just causes confusion and leaves stale and outdated information written down.
It's an interesting thought experiment as well. I originally thought that having the model write its own memory files would be a nice addition, but after playing around with it, it became clear to me that what sounds good as an idea turns out badly in practice, because the model can't correctly gauge what deserves to be stored as a memory.
This is infuriatingly common wrt talking/writing about how to use AI effectively: all of the "this is how you write an AGENTS.md" and "you need to talk to it like X to optimize it". Like sure, you can believe that as much as you want, but unless you provide some evidence you can keep your shitty CLAUDE.md to yourself and not pollute the whole company's git repo, thanks.
> Don't start generating an auto-memory entry before asking me. Ask first, write only if I confirm - no speculative drafting.
No more crap after this.
Incidentally I don't recall Opus 4.8 asking me once in the past few weeks. Older models did ask semi-frequently.
Toggle it off and never think about it again.
I refuse to believe this is true. The ability for an agent to find information from before a compaction is incredibly useful. At compaction time it's impossible to know what exactly may be still needed.
Not that this isolated article is super damning or anything, but the accumulated set of all these reports has left me feeling only empathy, I think, for these other devs. Like, I just want to tell them, "it can be ok, it doesn't need to be like this..."
The software world is very close to building a superintelligent senior software developer. Companies like this will automatically capture all the best things a software engineer does. Now Claude will add that into the coding agents itself.
Damn, I didn't see this coming.
First we build the intelligent builder. We will figure out what we want to build later.
Edit: Before more people take it seriously: this is sarcasm. I don't wish for this.