Hacker News | cdiamand's comments

Great work OP.

This is super helpful for visual learners and for starting to get one's mind oriented in a new domain.

Excited to see where you take this.

Might be interesting to have options for converting Wikipedia pages or topic searches down the line.


Thank you for the feedback and the great ideas!


Linking to the postgresql docs since they are very well written and surprisingly enjoyable to read.

https://www.postgresql.org/docs/current/indexes-intro.html


This is something I have found missing in my current workflow when reviewing PRs, particularly in the age of large, AI-generated PRs.

I think most reviewers do this to some degree by looking at points of interest. It'd be cool if this could look at your prior reviews and try to learn your style.

Is this the correct commit to look at? https://github.com/manaflow-ai/cmux/commit/661ea617d7b1fd392...


https://github.com/manaflow-ai/cmux/blob/main/apps/www/lib/s...

This file has most of the logic, the commit you linked to has a bunch of other experiments.

> look at your prior reviews and try to learn your style.

We're really interested in this direction too: maybe setting up a DSPy system to automatically fit reviews to your preferences.


Thank you. This is a pretty cool feature that is just scratching the surface of a deep need, so keep at it.

Another perspective where this exact feature would be useful is in security review.

For example - there are many static security analyzers that look for patterns, and they're useful when you break a clearly predefined rule that is well known.

However, there are situations that static tools miss, where a highlighting tool like this could help bring a reviewer's eyes to a high-risk "area". E.g., scrutinize this code more closely because it deals with user input and there is a chance of SQL injection here.

I think that would be very useful as well.


This is a very interesting idea that we’ll definitely look into.


Looks pretty neat, and certainly addresses a missing element in the current AI workflow.

Question: What happens to our data - i.e. the code and context sent to your service?


We log basic request metadata (timestamps, model used, token counts). Prompts and messages are not logged unless you explicitly opt-in. We don't store tool results. Note the underlying model provider you use may store data separately depending on your user agreement with them.


Great stuff! You're missing a few bass drum notes in "When the Levee Breaks".


You can click on the pattern, then click on the link below "Create a copy" and add missing bass drums. It's like forking a drum pattern lol.


Is there a link to the contract somewhere?



The security section is good to see. Thanks for that!


Funny that you mention that. I honestly thought not too many people would care about it.

Though I am by no means a security specialist. Please let me know where I can improve the section!


I've used LLMs in this capacity, and it's awesome. It quickly becomes a crutch.


Last year, I built a DM for myself using the OpenAI API and an ElevenLabs voice generator. I asked it to set my character in Baldur's Gate, so it could draw on the huge amount of DnD source material it had been trained on.

A few takeaways:

1. An LLM based DM can give the player essentially infinite richness and description on anything they ask for.

2. It is difficult to set rules for the LLM that match the DnD rulebook, but this is possible to solve for. Also, I found the LLM to be too pliable as a DM: I kept getting my way, or getting my hand held through scenarios. Maybe this is a feature?

3. My conversation quickly approached the LLM's context window, and some RAG engineering is necessary to keep the LLM informed about the key parts of your history.

4. Most importantly, I found that I most enjoy the human connection that I get through DnD and an LLM with a voice doesn't really satisfy that.


> Also, I found the LLM to be too pliable as a DM. I kept getting my way, or getting my hand held through scenarios. Maybe this is a feature?

LLMs are fine-tuned to be "helpful assistants", so they're basically sycophantic.


This was my experience too. The short context and the optimism bias make ChatGPT the wrong solution.

It starts well and then NPCs become inconsistent and the DM basically lets you craft the story by constantly doing a "yes and".

It becomes boring because the stakes feel so low.


> My conversation quickly began to approach the context window for the LLM and some RAG engineering is very necessary to keep the LLM informed about the key parts of your history

Assuming we're talking about GPT-4o, that 128k context window theoretically corresponds to somewhere around 73,000 words. People talk at around 100 words per minute in conversation, so that would be about 730 minutes of context, or about 12 hours. The Gemini models can do up to 2 million tokens of context... which we could extrapolate to 11,400 minutes of context (190 hours), which might be enough?
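Back-of-the-envelope, the arithmetic above can be sketched as follows. The words-per-token ratio here is just the one implied by the comment's own figures (128k tokens to roughly 73,000 words), not an official conversion:

```python
# Rough arithmetic for "hours of spoken conversation per context window".
# Assumes ~100 spoken words per minute, and the words-per-token ratio
# implied by the figures above (128k tokens -> ~73,000 words).

WORDS_PER_TOKEN = 73_000 / 128_000
WORDS_PER_MINUTE = 100

def context_minutes(tokens: int) -> float:
    """Approximate minutes of spoken conversation that fit in `tokens`."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_MINUTE

print(round(context_minutes(128_000)))    # ~730 minutes, about 12 hours
print(round(context_minutes(2_000_000)))  # ~11406 minutes, about 190 hours
```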

I would say GPT-4o was only good up to about 64k tokens the last time I really tested large context stuff, so let's call that 6 hours of context. In my experience, Gemini's massive context windows are actually able to retain a lot of information... it's not like there's only 64k usable or something. Google has some kind of secret sauce there.

One could imagine architecting the app to use Gemini's Context Caching[0] to keep response times low, since it wouldn't need to re-process the entire session for every response. The application would just spin up a new context cache in the background every 10 minutes or so and delete the old one, reducing the amount of recent conversation that would have to be re-processed each time to generate a response.
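The rotation logic described above might look something like this. `create_cache` and `delete_cache` are hypothetical placeholders standing in for a provider's context-caching API (such as Gemini's); the sketch only shows the refresh-every-ten-minutes bookkeeping, not the real calls:

```python
# Hedged sketch of rotating a context cache so only recent conversation
# is re-processed per response. The cache functions are placeholders.

def create_cache(transcript: str) -> str:
    """Placeholder: would upload the transcript and return a cache handle."""
    return f"cache-{len(transcript)}"

def delete_cache(handle: str) -> None:
    """Placeholder: would free the old cache on the provider's side."""

def maybe_refresh(cache: str, last_refresh: float, transcript: str,
                  now: float, interval_s: float = 600.0):
    """Every `interval_s` seconds (~10 minutes), build a fresh cache from
    the full transcript and drop the old one."""
    if now - last_refresh >= interval_s:
        new_cache = create_cache(transcript)
        delete_cache(cache)
        return new_cache, now
    return cache, last_refresh
```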

I've just never seen RAG work particularly well... and fitting everything into the context is very nice by comparison.

But, one alternative to RAG would be a form of context compression... you could give the LLM several tools/functions for managing the context. The LLM would be instructed to use these tools to record (and update) the names and information of different characters, places, and items that the players encounter, important events that have occurred during the game, as well as information about who the current players are and what items and abilities those players have, and then the LLM would be provided with this "memory" in the context in place of a complete conversational record. The LLM would then just receive (for example) the most recent 15 or 30 minutes of conversation, in addition to that memory.
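A minimal sketch of that memory-tool idea, with all names hypothetical: a real implementation would register these functions as tool/function schemas with whatever LLM API is in use, and the LLM would call them during play.

```python
# Hypothetical "memory tools" the LLM could call, plus a context builder
# that replaces the full transcript with memory + recent conversation.

memory = {"characters": {}, "events": []}

def remember_character(name: str, info: str) -> None:
    """Tool the LLM calls to record or update a character's details."""
    memory["characters"][name] = info

def record_event(description: str) -> None:
    """Tool the LLM calls to log an important game event."""
    memory["events"].append(description)

def build_context(recent_transcript: str) -> str:
    """Assemble the prompt: structured memory plus only the recent talk."""
    chars = "\n".join(f"- {n}: {i}" for n, i in memory["characters"].items())
    events = "\n".join(f"- {e}" for e in memory["events"])
    return (f"Known characters:\n{chars}\n\nKey events:\n{events}\n\n"
            f"Recent conversation:\n{recent_transcript}")

# Example usage (names invented for illustration):
remember_character("Shadowheart", "cleric, secretive about her past")
record_event("The party defeated the goblin camp")
print(build_context("Player: We head back to the grove."))
```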

> I found the LLM to be too pliable as a DM.

I haven't tried using an LLM as a DM, but in my experience, GPT-4o is happy to hold its ground on things. This isn't like the GPT-3.5 days where it was a total pushover for anything and everything. I believe the big Gemini models are also stronger than the old models used to be in this regard. Maybe you just need a stricter prompt for the LLM that tells it how to behave?

I also think the new trend of "reasoning" models could be very interesting for use cases like this. The model could try to (privately) develop a more cohesive picture of the situation before responding to new developments. You could already do this to some extent by making multiple calls to the LLM, one for the LLM to "think", and then another for the LLM to provide a response that would actually go to the players.
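The two-call pattern described above could be wired up roughly like this. `call_llm` is a stand-in for any chat-completion API call; the point is that the first call's output never reaches the players, only the second call's does:

```python
# Sketch of "think privately, then respond" via two LLM calls.
# `call_llm` is a placeholder for a real chat-completion request.

def call_llm(system: str, user: str) -> str:
    """Placeholder: a real version would call an LLM API here."""
    return f"response given system prompt ({len(system)} chars) to: {user}"

def dm_turn(situation: str) -> str:
    # First call: private reasoning the players never see.
    plan = call_llm(
        "You are a D&D DM. Privately reason about consequences, NPC "
        "motivations, and rules before deciding what happens next.",
        situation,
    )
    # Second call: the narration actually sent to the players,
    # conditioned on the hidden plan.
    return call_llm(
        "You are a D&D DM. Narrate the outcome for the players, "
        f"following this hidden plan: {plan}",
        situation,
    )
```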

One could also imagine giving the LLM access to other functions that it could call, such as the ability to play music and sound effects from a pre-defined library of sounds, or to roll the dice using an external random number generator.
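The dice-rolling tool in particular is easy to sketch: exposing a function like this to the LLM would mean rolls come from a real random number generator rather than the model inventing numbers. The spec format here is an assumption for illustration:

```python
import random
import re

def roll(spec: str) -> int:
    """Roll dice given a spec like '2d6' or '1d20'; returns the total."""
    m = re.fullmatch(r"(\d+)d(\d+)", spec)
    if not m:
        raise ValueError(f"bad dice spec: {spec}")
    count, sides = int(m.group(1)), int(m.group(2))
    return sum(random.randint(1, sides) for _ in range(count))
```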

> 4. Most importantly, I found that I most enjoy the human connection that I get through DnD and an LLM with a voice doesn't really satisfy that.

Sure, maybe it's not something people actually want... who knows. But, I think it looks pretty fun.[1]

One of the harder things with this would be helping the LLM learn when to speak and when to just let the players talk amongst themselves. A simple solution could just be to have a button that the players can press when they want, which will then trigger the LLM to respond to what's been recently said, but it would be cool to just have a natural flow.

[0]: https://ai.google.dev/gemini-api/docs/caching

[1]: https://www.youtube.com/watch?v=9oBdLUEayGI


I did something similar, but tried to get several agents to play a DnD round together. It basically worked, but was insipid.


I come back to it from time to time and play for a week. It looks like it's been getting more accessible over time, with a new UI update having been rolled out recently.

There's definitely a certain mindset that helps make it more enjoyable. Imagination being key! From the technical side, the developers have peeled back the curtain on how they've created parts of the game:

https://www.youtube.com/watch?v=U03XXzcThGU&ab_channel=Logo%...

https://www.youtube.com/watch?v=jV-DZqdKlnE&ab_channel=GDC20...

