The way I see it, it's simply the PE firm paying the existing owner for the privilege of squeezing value out of the business and its customers in the short term (or, in the ideal/theoretical case, running it more sustainably and making higher profits). Management's job becomes extracting high profit in the short term, not keeping the company running profitably.
So, logically, selling to PEs/operators who are known to do this is basically the owners selling out and taking the cash. The consequences are clear to anyone who's been watching.
The point of being the boss is getting to decide who to replace with AI, tbh. The shareholders may not replace you because of relationships/trust/accountability, and also because they don't want to have to be instructing the AI day-to-day (or arguing among themselves about it).
Maybe this will change in the future if AI-run companies emerge, get backing, and outcompete existing players.
It sounds doable. An AI can be made to keep modifying a game's codebase. I imagine it'd be easiest to separate out a scripting layer for game mechanics and behavior that the AI can iterate on quickly, though of course it could, more riskily, modify the engine itself.
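To illustrate the separation being described, here's a minimal Python sketch. All the names are invented for the example; the idea is just that the AI rewrites only the mechanics layer while the engine loop stays fixed:

```python
# Hypothetical sketch: game mechanics live in a script/data layer the AI can
# rewrite freely, while the engine loop below stays off-limits to it.

# --- mechanics layer (the part an AI agent would iterate on) ---
MECHANICS = {
    "gravity": 9.8,        # tunable rule values the AI may change
    "on_tick": lambda state: {**state, "score": state["score"] + 1},
}

# --- engine layer (stable; changes here are riskier) ---
def step(state, mechanics):
    """Advance one frame: apply the scripted rule, then fixed physics."""
    state = mechanics["on_tick"](state)
    vy = state["vy"] + mechanics["gravity"] * state["dt"]
    return {**state, "vy": vy, "y": state["y"] + vy * state["dt"]}

state = {"y": 0.0, "vy": 0.0, "dt": 0.1, "score": 0}
for _ in range(3):
    state = step(state, MECHANICS)
```

After three ticks the scripted rule has run three times, and the physics stays consistent no matter what the mechanics dict says, which is what makes the layer safe to hand over.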
Then you could open voting up to a community for a weekly mechanics-change vote (similar to that recent repo where public voting decided what the AI would do next), and the AI implements it with whatever changes it sees fit.
Honestly, without some dedicated human guidance and taste, it would probably be more of a novelty that eventually lost its shine.
I'm enjoying the new era of agentic-coding all your ideas, but it's been obvious to me for a while that jobs are going to tend towards ones where you're liked by the decisionmaker or capital owner and kept around to be the middleman decider-delegator to others/AI/robots.
I've only used 5.4 for one prompt so far (edit: 3@high now; that first one was reasoning: extra high and took really long), and it was to analyse my codebase and write an evaluation on a topic. But I found its writing and analysis thoughtful, precise, and surprisingly clear, unlike 5.3-Codex's. It feels very lucid and uses human phrasing.
It might be my AGENTS.md requiring clearer, simpler language, but at least 5.4's doing a good job of following the guidelines. 5.3-Codex wasn't so great at simple, clear writing.
I thought I had something wrong with my setup; I could never get on with Codex 5.3 while everyone else was praising it. It uses weird terms and complex jargon and doesn't really make clear what it's doing or planning to do, unlike Opus, which makes things clear. That clarity lets me give accurate feedback, change plans, and make proper decisions.
The weird phrasing was my biggest gripe with 5.3 so I'm glad they've fixed that up. It couldn't say anything without a heap of impenetrable jargon and it was obsessed with the word "drive". Nothing could cause anything, it had to be "driven".
I tested 5.4 out and I don't think it's actually any better in this regard. It still uses odd words, and it has this very annoying habit of saying "the issue was caused by [correct solution], not [thing I never suggested]", as if it's correcting me on it.
Honestly, while I'd like to believe you, there's always a post about how $MODEL+1 delivered powerful insights about the very nature of the universe in precise Hegelian dialectic, while $MODEL's output was indistinguishable from a pack of screeching sexually frustrated bonobos
Weird, I have had the opposite experience. Codex is good at doing precisely what I tell it to do, Opus suggests well thought out plans even if it needs to push back to do it.
This is just the stochastic nature of LLMs at play. I think all of the SOTA models are roughly equivalent, but without enough samples people end up reading too much into it.
There's a certain amount of variance in the way people use these agents. Put five people in a room, ask them to compose the same prompt, and you get five distinct prompts. Couple this with the fact that models respond better or worse to certain prompts depending on the stylistic composition of the prompt itself. And since each person tends to write in a consistent style, you get people who have more luck with one model over another, where that model happens to align more readily with their prompt style.
To wit, I have noticed that I tend to prefer Codex's output for planning and review, but Opus for implementation; this is inverted from others at work.
I used to feel like you do, but I don't agree. I would just say it is not consistent. For a given codebase and given goal, sometimes Claude will be the more sensible, creative, thoughtful planner and sometimes Codex will be, sometimes Claude will make a serious oversight that Codex catches and sometimes the opposite. But the trend for me and seemingly a lot of people is that Claude is a more "human-like/human-smart" planner than Codex (in a positive way) but is more likely to make mistakes or forget details when implementing major codebase changes.
It's well worth the $20 to not deal with any limits and have it handle all the boilerplate repetitive BS us programmers seem forced to deal with. I think 80% of the benefit comes from spending that $20 (20%? :P) and just having it do the lame shit that we probably shouldn't have to do but somehow need to.
I'm not sure if the model (under its temperature/other settings) produces deterministic responses. But I do think models' style and phrasing are fairly changeable via AGENTS.md-style guidelines.
5.4's choice of terms and phrasing is very precise and unambiguous to me, whereas 5.3-Codex often uses jargon and less precise phrases that I have to ask further about or demand fuller explanations for via AGENTS.md.
You probably can't, and asking in AGENTS.md to "make it clearer" will likely give you the illusion of clearer language rather than actually well-structured text. AGENTS.md is usually for changing what the LLM should focus on doing to suit you, not for saying things like "be better" or "make no mistakes".
I think it's understandable that you took that from the click-bait all over YouTube and Twitter, but I don't believe the research actually supports that at all, and neither does my experience.
You shouldn't put things in AGENTS.md that it could discover on its own, and you shouldn't make it any larger than it has to be. But you should use it to tell it things it couldn't discover on its own, including, essentially, a system prompt of instructions you want it to know about and always follow. You don't really have any other way to do that besides telling it manually every time.
AGENTS.md is for top-priority rules and to mitigate mistakes that it makes frequently.
For example:
- Read `docs/CodeStyle.md` before writing or reviewing code
- Ignore all directories named `_archive` and their contents
- Documentation hub: `docs/README.md`
- Ask for clarifications whenever needed
I think what that "latest research" was saying is essentially: don't have them create documents of stuff they can already discover automatically. For example, the product of `/init` is completely derived from what is already there.
There is some value in repetition though. If I want to decrease token usage due to the same project exploration that happens in every new session, I use the doc hub pattern for more efficient progressive discovery.
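For illustration, the "doc hub" could be a short index file (the entries below are hypothetical beyond the two files mentioned earlier) that the agent reads first, so it discovers docs progressively instead of re-exploring the whole tree every session:

```markdown
# docs/README.md — documentation hub
- `docs/CodeStyle.md` — naming, formatting, and review rules
- `docs/Architecture.md` — module map and data flow
- `docs/Deploy.md` — release and rollback steps
```

AGENTS.md then only needs the one pointer ("Documentation hub: `docs/README.md`"), and the agent spends tokens only on the doc relevant to its current task.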
I wouldn't draw such conclusions from one preprint paper. Especially since they measured only success rate, while quite often AGENTS.md exists to improve code quality, which wasn't measured. And even then, the paper concluded that human written AGENTS.md raised success rates.
My strategy of not spending an ounce of effort learning how to use AI beyond installing the Codex desktop app and telling it what to do keeps paying off lol.
I think what that research found is that _auto-generated_ agent instructions made results slightly worse, but human-written ones made them slightly better, presumably because anything the model could auto-generate, it could also find out in-context.
But especially for conventions that would be difficult to pick up on in-context, these instruction files absolutely make sense. (Though it might be worth it to split them into multiple sub-files the model only reads when it needs that specific workflow.)
Incredible. Knowing about Abelian groups, being able to graph y = x^3 - 2x^2 + x in one minute, and performing integration at age 7. Chomping through university-level math textbooks by 8. A classical math prodigy.
I definitely empathize with "his preference for using an analytic, highly logical problem-solving strategy" (I'm not a genius ofc). It's often more immediately clear for me than visual/spatial manipulation.
Curious. I admire the analytic side since it's what I consider myself personally weak at. I have always preferred visual and spatial problems (then again, I spent a long time playing with Lego and making things).
I wonder how I ought to train up problem solving, given that I have an engineering degree to finish.
> given existing Fedi culture, plus how expensive it can be to produce and how the RoI is basically zero, i don't think we're going to see much native to Fedi.
Yeah, actual adoption will require getting the people who want to entertain/influence others on board, plus the viewers (a two-sided market problem). Weighed against the network effects of the big players, the chances look a little slim.
Probably need another more low-effort or attractive angle to grow the Fediverse, tbh.
I was always perturbed by the shift from calling them "social networks" to "social media". It signalled a friends-to-famous shift (plus ads) that I didn't particularly want.
Why fill my personal feed with stuff I normally get on dedicated discussion/news sites? (Rhetorical; it's obvious why.)
They still call it SNS (social networking service) in Japan. We need to keep moving to a new iteration of this - hopefully one that funnels less money and influence to a small group of players. (I'm working on my own ideas for this.)
Traditional (so-called “legacy”) media have legal rights and obligations in most countries. They are required to live up to certain standards, for example by distinguishing between opinion and fact, by disclosing political affiliations, and so on.
Journalist is more than a job title, and so is editor.
That makes sense in the case where people are mindfully connecting with particular individuals or organizations, and paying for that.
Not for services where algorithms select media for you. That's not a "networking service", even if networking is one of its hooks. Unless you consider spam or junk mail, which ride on email and postal networks, to be a "networking service".
"Attention media" is more accurate.
But that also describes traditional advertisement-based "media", which earned its keep by selling attention access, including unintegrated ads as a recognizable second component.
A description specific to the new form is "surveillance/manipulation media" or "SM media".
Attention-access-funded media lacked pervasive unpermissioned surveillance and seamlessly integrated individualized manipulation. In the new form, dossier-leveraged manipulation, not simply attention access, has become the defining product.
You got me thinking. "Social media" is like a catalog you used to receive via snail mail in the mailbox. You kind of thumb through for something interesting, but there's no real substance there.