Hacker News

  a9284923-141a-434a-bfbb-52de7329861d
  d48d5a68-82cd-4988-b95c-c8c034003cd0
  5c236e02-16ea-42b1-b935-3a6a768e3655
  22e09356-08ce-4b2c-a8fd-596d818b1e8a
  4cb894f7-c3ed-4b8d-86c6-0242200ea333
Amusingly (not really), this is me trying to resume sessions so I could grab the feedback IDs. It was an absolute chore to get it to give me the commands to resume these conversations, and it kept messing things up: cf764035-0a1d-4c3f-811d-d70e5b1feeef


Thanks for the feedback IDs — read all 5 transcripts.

On the model behavior: your sessions were sending effort=high on every request (confirmed in telemetry), so this isn't the effort default. The data points at adaptive thinking under-allocating reasoning on certain turns: the specific turns where it fabricated (Stripe API version, git SHA suffix, apt package list) had zero reasoning emitted, while the turns with deep reasoning were correct. We're investigating with the model team. Interim workaround: CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 forces a fixed reasoning budget instead of letting the model decide per turn.
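For readers who want to try the workaround, a minimal sketch: the variable name comes from the comment above, and the `claude --resume` invocation is an assumption about how you would retry an affected session.

```shell
# Interim workaround from the comment above: disable per-turn adaptive
# thinking so Claude Code uses a fixed reasoning budget instead of letting
# the model decide. Variable name taken from the thread; exact semantics
# may change between releases.
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1

# Then retry the affected session (hypothetical placeholder session id):
# claude --resume <session-id>

# Confirm the variable is set in the current shell:
echo "$CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING"
```

Note this only affects processes launched from the shell where it was exported; put it in your shell profile if you want it to persist.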


Hey bcherny, I'm confused as to what's happening here. The linked issue was closed, with you seeming to imply there's no actual problem and that people are just misunderstanding the hidden reasoning summaries and the change to the default effort level.

But here you seem to be saying there is a bug, with adaptive reasoning under-allocating. Is this a separate issue from the linked one? If not, wouldn't it help to respond to the linked issue acknowledging a model issue and telling people to disable adaptive reasoning for now? Not everyone is going to be reading comments on HN.


It's better PR to close issues and tell users they're holding it wrong, then quietly fix the problem in the background. Possibly also safer for legal reasons.


Isn't that what they just did here? Close Stella's issue, cross-post to HN, completely sidestep the observation users are making, and dismiss the person analyzing the transcripts with a straw man blaming… thinking summaries.


There's a 5-hour gap between the replies, and new data came in during that time, so the posts aren't really in conflict.

Also, it doesn't sound like they know there's a model issue yet, so reopening now would be premature. Maybe they just read the data wrong; better to let a few others verify first, then reopen.


Love this. Responding to users, detailed info on the investigation, action being taken (at least it seems so).


And all hidden in the comments of a niche forum, while the actual issue is closed and whitewashed? You got played.


Surely you realize it's AI responding? (not sure if /s)


I cannot provide the session IDs, but I have tried the above flag and can confirm it makes a huge difference. You should treat this as a bug and make this the default behavior. Clearly the adaptive thinking is making the model plain stupid and useless. It's time you guys take this seriously and stop messing with performance on every damn release.


Just set that flag and I'm already getting similarly poor results. New one: 93b9f545-716c-4335-b216-bf0c758dff7c


And another where Claude gets into a long cycle of "wait, that's not right... hold on... actually..." correcting itself in its train of thought. It found the answer eventually but wasted a lot of cycles getting there (reporting because this is a regression in my experience vs. a couple of weeks ago): 28e1a9a2-b88c-4a8d-880f-92db0e46ffe8


Another: 1395b7d6-f2f1-4e24-a815-73852bcdeed2

It fails to answer my initial question and instead tells me what I need to do to check. Then it hallucinates an answer without researching anything and reaches an inaccurate conclusion; only when I prompt it further does it finally arrive at a (maybe) correct answer.

I have a few more I haven't submitted, but I think it's safe to say that disabling adaptive thinking isn't the answer here.


My guess is there isn't enough hardware, so Anthropic is trying to limit how much soup the buffet serves. Did I guess right? And I would absolutely bet the enterprise accounts with millions in spend get priority, while retail users are the first to get throttled.


This kind of thing is harder for regular end users to understand after the change that removed reasoning details.


I am curious: are you able to see our session text based on the session ID? That was a big no at some of the tier-1 places I've worked; no employee could see user texts.


IIRC for Enterprise, using /feedback or /bug is an exception to the "we promise not to use your data" agreement.


> The data points at adaptive thinking under-allocating reasoning on certain turns

Will you reopen the issue you incorrectly closed, then…? Or are you just playacting concern?




Have you set effort to high or max?


Even with effort set to high, adaptive thinking can still choose no thinking. See bcherny's post they were replying to: https://news.ycombinator.com/item?id=47668520


Yeah, I know, but you can disable it, as we saw.



