More

anuramat · 2026-04-08T03:27:47 1775618867

it's not gonna get much more autonomous without self play and major change in architecture

anuramat · 2026-04-08T03:22:08 1775618528

as much as I hate cc, 95% of the issues there are either AI psychosis or user error

iLoveOncall · 2026-04-08T08:26:33 1775636793

So it should be insanely easy for this world altering model to comb through them and close irrelevant ones.

anuramat · 2026-04-08T08:57:07 1775638627

torturing a model with human stupidity probably doesn't align with their position on model welfare; wondering if they tried bullying it into hacking its way out of the slop gulag

HarHarVeryFunny · 2026-04-08T12:08:34 1775650114

Yes, perhaps it finds it stressful operating on itself.

Maybe that's why they haven't released it - to give it a vacation?

menno-dot-ai · 2026-04-08T09:56:16 1775642176

@anthropic, send me an email if you need access to a jupyter notebook that'd motivate haiku to hack itself into and then back out of the pentagon

HarHarVeryFunny · 2026-04-08T12:07:16 1775650036

So "only" 250 real bugs?

anuramat · 2026-04-08T03:14:48 1775618088

imho it was more reasonable back then to claim "agi soon" -- back when nobody really knew how it scales

phire · 2026-04-08T03:19:17 1775618357

They weren't claiming it was dangerous because "AGI soon", that didn't come until later.

OpenAI were claiming GPT-2 was too dangerous because it could be used to flood the internet with fake content (mostly SEO spam).

And they were somewhat right. GPT-2 was very hard to prompt, but with a bit of effort it could spit out endless pages that were good enough to fool a search engine, and even a human at a first glance (you were often several paragraphs in before you realised it was complete nonsense.

make3 · 2026-04-13T01:52:07 1776045127

we essentially have AGI right now brother

bdangubic · 2026-04-13T01:56:26 1776045386

we got the A and G parts, just missing the I part but it’s coming :)

anuramat · 2026-04-07T18:49:24 1775587764

"some model I don't get to use is much better at benchmarks"

pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit

estearum · 2026-04-07T18:58:57 1775588337

So... you're not excited because it might take a few months before we can use it or something? I don't get your comment.

RivieraKid · 2026-04-07T20:43:53 1775594633

Whether you're excited depends on what do you do for living and how close you are to financial independence.

estearum · 2026-04-07T20:55:39 1775595339

I agree there are other valid reasons not to be excited about this, I just can't make sense of the ones provided above.

randomgermanguy · 2026-04-07T19:20:39 1775589639

I think the general question is if they'll release it at all, haven't yet read anything stating that they would

estearum · 2026-04-07T19:33:12 1775590392

Well let me introduce people to a few brand new concepts:

https://en.wikipedia.org/wiki/Capitalism

https://en.wikipedia.org/wiki/Race_to_the_bottom

https://en.wikipedia.org/wiki/Arms_race

Of course they'll release it once they can de-risk it sufficently and/or a competitor gets close enough on their tail, whichever comes first.

anuramat · 2026-04-08T03:18:37 1775618317

I'm not excited because they might be ~lying

anuramat · 2026-04-07T18:36:50 1775587010

"oops, our latest unreleased model is so good at hacking, we're afraid of it! literal skynet! more literal than the last time!"

almost like they have an incentive to exaggerate

knowaveragejoe · 2026-04-07T18:53:23 1775588003

I'm sure they do, yet the models really are getting scarily good at this. This talk changed my view on where we're actually at:

https://www.youtube.com/watch?v=1sd26pWhfmg

anuramat · 2026-04-07T02:18:45 1775528325

any particular reason you're not using a sandbox?

anuramat · 2026-04-06T15:52:38 1775490758

why would you train a separate model?

eden-u4 · 2026-04-06T18:22:40 1775499760

Guardrailing is usually done with a smaller model (< 1b) to filter out simple "not aligned prompt" and not waste compute.

anuramat · 2026-04-06T20:27:49 1775507269

sure, but here we start with a performance problem, not compute

lukewarm707 · 2026-04-07T08:51:45 1775551905

pretrain it on a bunch of prompt injections and then tune it to return pass/fail

anuramat · 2026-04-06T05:14:20 1775452460

otherwise you end up with "get a $20 subscription for 1000% more value -- equivalent to $200 in API usage!!![1]; [1] -- compared to API pricing for american companies on the first weekend of the month between 18:00 and 22:00 UTC+8 during full moon"

in any case, better than what anthropic does

> user-hostile

credits do expire (I thought they always do?), apparently it's not really up to them: https://news.ycombinator.com/item?id=46230848

anuramat · 2026-04-05T17:20:24 1775409624

from what they wrote, they're just changing how they measure the usage; might even be a good thing if you manage your context right:

> This format replaces average per-message estimates for your plan with a direct mapping between token usage and credits. It is most useful when you want a clearer view of how input, cached input, and output affect credit consumption.

anuramat · 2026-04-05T06:27:20 1775370440

idgi, shitting on the maintainer takes 10x more time than forking the repo

I guess he really needed the latest ci/chore commits