Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I clicked this thinking “oh, cool, someone finally made a portable version of the Claude.ai* memory system!” Spoiler, no, it’s not it at all, it’s just a “store”/“remember” memory system… as opposed to the Claude.ai memory system, where it doesn’t make the model actively have to write memories on its own, but rather has a model in the background go through your chat history and generate a summary from it.

I’ve found the latter approach to work much, much better than simple “store”/“remember” systems.

So, it just feels misleading to say this can do what Claude.ai’s can do…

(I’ve been looking for a memory system that works the same for a while, so that I can switch away from Claude.ai to something else like LibreChat, but I just haven’t found any. Might be the only thing keeping me on Claude at this point.)

-

*I say Claude.ai because that’s specifically what has the system; Claude Code doesn’t have this system



In the recent Claude Code leak, there was apparently something called "autoDream", a "background memory consolidation engine" according to this: https://kuber.studio/blog/AI/Claude-Code's-Entire-Source-Cod...


I really want to try this approach. I'm curious because this has not been my experience at all. I created https://github.com/flippyhead/ai-brain mostly just for myself and a few friends use it. But so far, telling the AI (via CLAUDE.md) to look for relevant memories and to think about when and how to save them has worked very well. It can create structures based on decided priorities, notes for the future, that feel like they'd be very different if it was just trying to summarize everything.


I use Claude code hooks to prompt and store memories. It’s taken a lot of iterations mostly on the definition of “significant” events being stored in memory. Indeed, it works very well now but I’m hesitant to start from scratch on some guys tool. I think demos are going to need reviews here on out. Vibe coded projects look too legit but it’s a waste of time to test the 100 that come out each day


I hear you. I've been slowly building up my own tool (linked above) and keep feeling like someone is going to soon release something that a lot of people will agree should be an independent standard. I'm reluctant to host it with someone else so it needs to be opensource. But then again what I've got is working well for me.


The biggest issue for me is recalling during conversation context, not jotting information down. I've solved this by including a tag for when to nudge the agent to recall something.

ie: "$recall words"

it works but its clunky


Shameless drop: My own agentic environment is also using a summarizer to sum up agent histories when they overflow in their context windows. Additionally, all short lived agents are based on requirements (WIP) as a centralized point for code, unit tests, and architecture.

(Everything is tailored to Go as a language)

Works pretty good so far, the user only interacts with the planner. I'm working atm on the requirements to have a spec driven workflow. Web UI is the most polished atm because of ability to have agent tabs on the side for better overview.

In case anyone is interested in this attempt:

[1] https://github.com/cookiengineer/exocomp


I favor automatic recall, invisible to the agent. For memory creation, I find tool calls do a pretty good job, though I also like automatic memory creation on context compression.

I think with automatic creation you need async consolidation (calling it dreaming is a little dramatic for my taste).

My implementation is at Elroy.bot, I recently wrote about different approaches to agent memory here: https://tombedor.dev/approaches-to-agent-memory/


That's an interesting concept. So it's like if you're an agent chatting with a user, you have an army of assistants who overhear the conversation and record important facts, or search relevant facts on some database and decide on the fly when to interrupt you with "this memory X looks relevant". Sounds easy enough if tokens were free, but an interesting problem to do it efficiently.


Burst-parallel non-frontier models can resemble "tokens were free". And there one might potentially augment not just conversations, but CoT - retroactively by submitting messages with altered reasoning strings, or inline with the inference loop watching CoT and attempting non-distracting injection.


Simple vector similarity plus a cheap model to filter results works pretty well. Though ofc t does add tokens to your primary chat, which is the basic tradeoff of memory systems in general (in addition to latency)


That's exactly what claude-code does these days. If you AFK for ~5 minutes it also produces a summary of where you are, which is useful if you're juggling multiple windows.


How do you benchmark that? The problem with extracting memories in the background is it's hard to make that work with the prefix cache. You can go a long way with a simple 2-stage LOG.md (detailed log of tasks & lessons) + MEMORY.md (log items that are promoted when the log itself gets truncated) + a stop hook to ensure this runs at the end of a turn.


I agree. silent agent doing agentic things async is what would be helpful, not requiring a modification to the main prompt


Yeah. The other advantage is a summary-based memory also just… “pieces together” things that a “store”/“remember” memory wouldn’t, because they’re things that the actual main agent would not think to store. i.e. small disconnected things across conversations that alone, would not end up in memory because they’re insignificant. But when there’s an agent looking at multiple conversations at once it can actually reason about this stuff and piece it together.


Just write a hook that runs claude -p after whatever you want and update whatever memory system you want. You can use a channel to inject back what topics were update or what have you.


I am not sure how using Claude -p is going to help you imitate Claude’s memory system for any ai agent…


Sure, but the idea is not to have this in Claude Code, the idea is to be able to use something like LibreChat with proper memory. I don’t really need that good of a memory system for my coding agent, it’s definitely more something I need for my chat agent.


Which is weird because if I remember correctly, there are summaries already generated by Claude on your hard drive of things you have done in the past.


Oh my pi does it. And it does it really well.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: