If we use prompt caching, isn't a largish MCP tools section just a fixed token penalty in return for higher speed at runtime, because tools don't need to be discovered on demand - and isn't that the better tradeoff? At least for the most powerful models, it doesn't feel like their quality goes down much with a few MCP servers. I might be missing something.
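To make the tradeoff in the question concrete, here is a back-of-envelope sketch. Every number is a hypothetical placeholder, not real provider pricing: it assumes cached input tokens are billed at 10% of the uncached rate, and that on-demand discovery would instead cost some tokens per request.

```python
# Hypothetical numbers only; real pricing and discovery costs will differ.
TOOL_TOKENS = 8_000      # largish MCP tools section kept in every prompt
CACHE_DISCOUNT = 0.10    # assumed: cached reads billed at 10% of base rate
DISCOVERY_TOKENS = 1_500 # assumed token cost of discovering tools on demand

def cached_cost(requests: int) -> float:
    """Token-equivalent cost: one full-price write to warm the cache,
    then discounted cache reads on every subsequent request."""
    return TOOL_TOKENS + (requests - 1) * TOOL_TOKENS * CACHE_DISCOUNT

def discovery_cost(requests: int) -> float:
    """Token-equivalent cost of discovering tools fresh on each request."""
    return requests * DISCOVERY_TOKENS

for n in (1, 10, 100):
    print(n, cached_cost(n), discovery_cost(n))
```

Under these made-up numbers the "fixed penalty" framing holds only while the cached section stays small relative to request volume; at high request counts the discounted reads still accumulate, so the crossover depends entirely on the discount and the discovery cost.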
This is interesting. Do you mean this is like chat with your book, or these are books you've already finished reading which you have a query over to ask? And does it search raw book text or metadata?
I add books to Calibre in order to load them on my Kindle. Claw gets the Calibre files, including both metadata.db and full-text-search.db, via Syncthing.
I can ask it to perform metadata updates, since it's a two-way sync, or ask questions about the content.
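For anyone curious what querying Calibre's metadata.db looks like: it is a plain SQLite file, so questions like "which books do I have by this author" are simple joins. The sketch below builds an in-memory stand-in with only three of Calibre's tables (books, authors, books_authors_link); the real schema is much richer, and column details beyond these should be checked against an actual library.

```python
import sqlite3

# In-memory stand-in for Calibre's metadata.db; only the three tables
# used below are sketched, with the table/column names Calibre uses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books_authors_link (book INTEGER, author INTEGER);
INSERT INTO books VALUES (1, 'Snow Crash');
INSERT INTO authors VALUES (1, 'Neal Stephenson');
INSERT INTO books_authors_link VALUES (1, 1);
""")

def books_by_author(conn, author_name):
    """Return all titles linked to the given author name."""
    rows = conn.execute(
        """SELECT b.title FROM books b
           JOIN books_authors_link l ON l.book = b.id
           JOIN authors a ON a.id = l.author
           WHERE a.name = ?""",
        (author_name,),
    ).fetchall()
    return [title for (title,) in rows]

print(books_by_author(conn, "Neal Stephenson"))  # → ['Snow Crash']
```

Point the connect call at a synced metadata.db (read-only is safest while Calibre is running) and the same query works against a real library.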
Kerns (https://kerns.ai) — a research environment for deeply understanding topics across multiple sources. Upload papers, articles, or books into a workspace that persists across sessions. Read with AI summaries that let you zoom in and out of any document. Generate knowledge maps to visualize how ideas connect. Run deep research agents that produce comprehensive, cited reports. Free to use; I'd love feedback from anyone doing heavy reading/research.
I don't really understand the amount of ongoing negativity in the comments. This is not the first time a product has been nearly copied, and for me the experience is far superior to coding in a terminal. It comes with improvements, even if imperfect, and I'm excited for those! I've long wanted the ability to comment on code diffs instead of just writing things back in chat. And I'm excited about the quality of Gemini 3.0 Pro, although I'm running into rate limits. I can already tell it's something I'm going to try out a lot!
It's not really good for real-life programming, though: it invents a lot of imaginary things, can't respect its own instructions, and forgets basic things (a variable is called "bananaDance", then it claims it's "bananadance", then later "bananaDance" again).
It is good at writing something from scratch (like spitting out its training set).
Claude is still superior for programming and debugging. Gemini is better at daily life questions and creative writing.
Similar story on my end; I'm coding up a complex feature. Claude would have taken fewer interventions on my part and would have been bug-free right off the bat. But apart from that, the experience is comparable.
Thanks a lot for the feedback! URLs already show up as links if the agent decides to do a search and find refs when making the mind map. I'll work on adding images, thanks!
I tried it; I've also tried a very similar but still different use case. I wonder if you have thoughts around how much of this is our own context management vs context management for the LLM. Ideally, I don't want to do any work for the LLM; it should be able to figure out from the chat which 'branch' of the tree I'm exploring, and then the artifact is purely for one's own use.
Very interesting that you bring this up. It was quite a big point of discussion whilst Jamie and I were building.
One of the big issues we faced with LLMs is that their attention gets diluted when you have a long chat history. This means that for large amounts of context, they often can't pick out the details your prompt relates to. I'm sure you've noticed this once your chat gets very long.
Instead of trying to develop an automatic system to decide what context your prompt should use (i.e. which branch you're on), we opted to make organising your tree a very deliberate action. This gives you a lot more control over what the model sees, and ultimately how good the responses are. As a bonus, if a model is playing up, you can go in and change the context it has by moving a node or two about.
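The deliberate-branch idea above can be sketched in a few lines: the context sent to the model is exactly the path from the tree's root to the node the user selected, so sibling branches stay invisible. This is an illustrative sketch; all names are hypothetical, not taken from the actual product.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """One message in a branching chat tree (hypothetical structure)."""
    message: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def reply(self, message: str) -> "Node":
        child = Node(message, parent=self)
        self.children.append(child)
        return child

def context_for(node: Node) -> List[str]:
    """Walk up to the root, then return messages in chronological order.
    Only the selected branch reaches the model; siblings are excluded."""
    path = []
    while node is not None:
        path.append(node.message)
        node = node.parent
    return list(reversed(path))

root = Node("Explain attention dilution")
a = root.reply("Branch A: long-context failure modes")
root.reply("Branch B: try retrieval instead")
a2 = a.reply("Follow-up on branch A")

print(context_for(a2))  # branch B never appears in A's context
```

Moving a node to a different parent is then literally "changing the context it has", since the root-to-node walk picks up the new ancestry.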
Really good point though, and thanks for asking about it. I'd love to hear if you have any thoughts on ways you could get around it automatically.
I also realised I forgot to commend you; I think this is a useful interface! Kudos on building it! I'm working on something very related myself.
I think in general these should not be conflated into one and the same artifact: a personal memory device versus LLM context management. Right now it seems to double up, and the main problem is that this puts the burden on me to manage my memory device, which I think should be automatic. I don't have perfect thoughts on it, so I'll leave it at this; it's a work in progress.
Something I'm wondering: suppose you add or remove a chunk of context - what do you do to evaluate whether that's better or not, when the final resulting code or test run might be half an hour or an hour later?
Is the expectation that you will be running many branches of context at the same time?
>I tried it; I've also tried a very similar but still different use case. I wonder if you have thoughts around how much of this is our own context management vs context management for the LLM.
Completely subjectively, for me it's both. I have several ChatGPT tabs where it is instructed not to respond, or to briefly summarise. The system works both ways, imho.
Wondering if there are other similar tools out there which people love, and why ChatGPT/Gemini/Claude won't let you do the same in their native apps.