The way I see it, it's simply the PE firm paying the existing owner for the privilege of squeezing value out of the business and its customers in the short term (or, in the ideal/theoretical case, running it more sustainably and making higher profits). Management's job becomes extracting high profit in the short term, not keeping the company running profitably.
So, logically, selling to PEs/operators who are known to do this is basically the owners selling out and taking the cash. The consequences are clear to anyone who's been watching.
The point of being the boss is getting to decide who to replace with AI, tbh. The shareholders may not replace you because of relationships/trust/accountability, and also because they don't want to have to be instructing the AI day-to-day (or arguing among themselves about it).
Maybe this will change in the future if AI-run companies emerge, get backing, and outcompete existing players.
It sounds doable. An AI can be made to keep modifying a game's codebase. I imagine it'd be easiest to separate out a scripting layer for game mechanics and behavior that the AI can iterate on quickly, though of course it could, more riskily, modify the engine itself.
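To illustrate the separation being described, here's a minimal Python sketch. All the names are invented for the example; the idea is just that the AI rewrites only the mechanics layer while the engine loop stays fixed:

```python
# Hypothetical sketch: game mechanics live in a script/data layer the AI can
# rewrite freely, while the engine loop below stays off-limits to it.

# --- mechanics layer (the part an AI agent would iterate on) ---
MECHANICS = {
    "gravity": 9.8,        # tunable rule values the AI may change
    "on_tick": lambda state: {**state, "score": state["score"] + 1},
}

# --- engine layer (stable; changes here are riskier) ---
def step(state, mechanics):
    """Advance one frame: apply the scripted rule, then fixed physics."""
    state = mechanics["on_tick"](state)
    vy = state["vy"] + mechanics["gravity"] * state["dt"]
    return {**state, "vy": vy, "y": state["y"] + vy * state["dt"]}

state = {"y": 0.0, "vy": 0.0, "dt": 0.1, "score": 0}
for _ in range(3):
    state = step(state, MECHANICS)
```

After three ticks the scripted rule has run three times, and the physics stays consistent no matter what the mechanics dict says, which is what makes the layer safe to hand over.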
Then you could open voting up to a community for a weekly mechanics-change vote (similar to that recent repo where public voting decided what the AI would do next), and the AI implements it with whatever changes it sees fit.
Honestly, without some dedicated human guidance and taste, it would probably be more of a novelty that eventually lost its shine.
I'm enjoying the new era of agentic-coding all your ideas, but it's been obvious to me for a while that jobs are going to tend towards ones where you're liked by the decisionmaker or capital owner and kept around to be the middleman decider-delegator to others/AI/robots.
I've only used 5.4 for one prompt so far (edit: 3@high now; that first one was reasoning: extra high and took really long), and it was to analyse my codebase and write an evaluation on a topic. But I found its writing and analysis thoughtful, precise, and surprisingly clear, unlike 5.3-Codex's. It feels very lucid and uses human phrasing.
It might be my AGENTS.md requiring clearer, simpler language, but at least 5.4's doing a good job of following the guidelines. 5.3-Codex wasn't so great at simple, clear writing.
I thought I had something wrong with my setup; I could never get on with Codex 5.3 while everyone else was praising it. It uses weird terms and complex jargon and doesn't really make clear what it's doing or planning to do, unlike Opus, which makes things clear. That clarity lets me give accurate feedback, change plans, and make proper decisions.
The weird phrasing was my biggest gripe with 5.3 so I'm glad they've fixed that up. It couldn't say anything without a heap of impenetrable jargon and it was obsessed with the word "drive". Nothing could cause anything, it had to be "driven".
I tested 5.4 out and I don't think it's actually any better in this regard. It still uses odd words, and it has this very annoying habit of saying "the issue was caused by [correct solution], not [thing I never suggested]", as if it's correcting me on it.
Honestly, while I'd like to believe you, there's always a post about how $MODEL+1 delivered powerful insights about the very nature of the universe in precise Hegelian dialectic, while $MODEL's output was indistinguishable from a pack of screeching sexually frustrated bonobos
Weird, I have had the opposite experience. Codex is good at doing precisely what I tell it to do, Opus suggests well thought out plans even if it needs to push back to do it.
This is just the stochastic nature of LLMs at play. I think all of the SOTA models are roughly equivalent, but without enough samples people end up reading too much into it.
There's a certain amount of variance in the way people use these agents. Put five people in a room, ask them to compose the same prompt, and you get five distinct prompts. Couple this with the fact that models respond better or worse to certain prompts depending on the stylistic composition of the prompt itself. And since each person tends to write in a consistent style, you get people who have more luck with one model over another, where that model happens to align more readily with their prompt style.
To wit, I have noticed that I tend to prefer Codex's output for planning and review, but Opus for implementation; this is inverted from others at work.
I used to feel like you do, but I don't agree. I would just say it is not consistent. For a given codebase and given goal, sometimes Claude will be the more sensible, creative, thoughtful planner and sometimes Codex will be, sometimes Claude will make a serious oversight that Codex catches and sometimes the opposite. But the trend for me and seemingly a lot of people is that Claude is a more "human-like/human-smart" planner than Codex (in a positive way) but is more likely to make mistakes or forget details when implementing major codebase changes.
It's well worth the $20 to not deal with any limits and have it handle all the boilerplate repetitive BS us programmers seem forced to deal with. I think 80% of the benefit comes from spending that $20 (20%? :P) and just having it do the lame shit that we probably shouldn't have to do but somehow need to.
I'm not sure if the model (under its temperature/other settings) produces deterministic responses. But I do think models' style and phrasing are fairly changeable via AGENTS.md-style guidelines.
5.4's choice of terms and phrasing is very precise and unambiguous to me, whereas 5.3-Codex often uses jargon and less precise phrases that I have to ask further about or demand fuller explanations for via AGENTS.md.
You probably can't, and asking in AGENTS.md to "make it clearer" will likely give you the illusion of clearer language rather than actually well-structured text. AGENTS.md is usually for changing what the LLM should focus on doing to suit you, not for saying things like "be better" or "make no mistakes".
I think it's understandable that you took that from the click-bait all over YouTube and Twitter, but I don't believe the research actually supports that at all, and neither does my experience.
You shouldn't put things in AGENTS.md that it could discover on its own, and you shouldn't make it any larger than it has to be. But you should use it to tell it things it couldn't discover on its own, including, essentially, a system prompt of instructions you want it to know about and always follow. You don't really have any other way to do that besides telling it manually every time.
AGENTS.md is for top-priority rules and to mitigate mistakes that it makes frequently.
For example:
- Read `docs/CodeStyle.md` before writing or reviewing code
- Ignore all directories named `_archive` and their contents
- Documentation hub: `docs/README.md`
- Ask for clarifications whenever needed
I think what that "latest research" was saying is essentially: don't have them create documents of stuff they can already discover automatically. For example, the product of `/init` is completely derived from what is already there.
There is some value in repetition though. If I want to decrease token usage due to the same project exploration that happens in every new session, I use the doc hub pattern for more efficient progressive discovery.
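For illustration, the "doc hub" could be a short index file (the entries below are hypothetical beyond the two files mentioned earlier) that the agent reads first, so it discovers docs progressively instead of re-exploring the whole tree every session:

```markdown
# docs/README.md — documentation hub
- `docs/CodeStyle.md` — naming, formatting, and review rules
- `docs/Architecture.md` — module map and data flow
- `docs/Deploy.md` — release and rollback steps
```

AGENTS.md then only needs the one pointer ("Documentation hub: `docs/README.md`"), and the agent spends tokens only on the doc relevant to its current task.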
I wouldn't draw such conclusions from one preprint paper. Especially since they measured only success rate, while quite often AGENTS.md exists to improve code quality, which wasn't measured. And even then, the paper concluded that human written AGENTS.md raised success rates.
My strategy of not spending an ounce of effort learning how to use AI beyond installing the Codex desktop app and telling it what to do keeps paying off lol.
I think what that research found is that _auto-generated_ agent instructions made results slightly worse, but human-written ones made them slightly better, presumably because anything the model could auto-generate, it could also find out in-context.
But especially for conventions that would be difficult to pick up on in-context, these instruction files absolutely make sense. (Though it might be worth it to split them into multiple sub-files the model only reads when it needs that specific workflow.)
Incredible. Knowing about Abelian groups, being able to graph y = x^3 - 2x^2 + x in one minute, and performing integration at age 7. Chomping through university-level math textbooks by 8. A classical math prodigy.
I definitely empathize with "his preference for using an analytic, highly logical problem-solving strategy" (I'm not a genius ofc). It's often more immediately clear for me than visual/spatial manipulation.
Curious. I admire the analytic side since it's what I consider myself personally weak at. I have always preferred visual and spatial problems (then again, I spent a long time playing with Lego and making things).
I wonder how I ought to train up problem solving, given that I have an engineering degree to finish.
> given existing Fedi culture, plus how expensive it can be to produce and how the RoI is basically zero, i don't think we're going to see much native to Fedi.
Yeah, actual adoption will require getting the people who want to entertain/influence others on board, plus the viewers (a two-sided market problem). Weighed against the network effects of the big players, the chances look a little slim.
Probably need another more low-effort or attractive angle to grow the Fediverse, tbh.
I was always perturbed by the shift from calling them "social networks" to "social media". It signalled a friends-to-famous shift (plus ads) that I didn't particularly want.
Why fill my personal feed with stuff I normally get on dedicated discussion/news sites? (Rhetorical; it's obvious why.)
They still call it SNS (social networking service) in Japan. We need to keep moving to a new iteration of this - hopefully one that funnels less money and influence to a small group of players. (I'm working on my own ideas for this.)
Traditional (so-called “legacy”) media have legal rights and obligations in most countries. They are required to live up to certain standards, for example by distinguishing between opinion and fact, by disclosing political affiliations, and so on.
Journalist is more than a job title, and so is editor.
That makes sense in the case where people are mindfully connecting with particular individuals or organizations, and paying for that.
Not for services where algorithms select media for you. That's not a "networking service", even if networking is one of its hooks. Unless you consider spam or junk mail, which ride on email and postal networks, to be a "networking service".
"Attention media" is more accurate.
But that also describes traditional advertisement-based "media", which earned its keep by selling attention access, including unintegrated ads as a recognizable second component.
A description specific to the new form is "surveillance/manipulation media" or "SM media".
Attention-access-funded media lacked pervasive unpermissioned surveillance and seamlessly integrated individualized manipulation. In the new form, dossier-leveraged manipulation, not simply attention access, has become the defining product.
You got me thinking. "Social media" is like a catalog you used to receive via snail mail in the mailbox. You kind of thumb through for something interesting, but there's no real substance there.