> Besides, what is the human brain if not a machine that generates "tokens" that...

breuleux · 2026-02-05T21:03:08 1770325388

The point is that "predicting the next token" is such a general mechanism as to be meaningless. We say that LLMs are "just" predicting the next token, as if this somehow explained all there was to them. It doesn't, not any more than "the brain is made out of atoms" explains the brain, or "it's a list of lists" explains a Lisp program. It's a platitude.

esafak · 2026-02-06T01:10:37 1770340237

It's not meaningless; it's a prediction task, and prediction is commonly held to be closely related to, if not synonymous with, intelligence.

breuleux · 2026-02-06T15:08:08 1770390488

In the case of LLMs, "prediction" is overselling it somewhat. They are token sequence generators. Calling these sequences "predictions" vaguely corresponds to our own intent in training these machines, because we use the value of the next token as a signal to either reinforce or move away from the current behavior. But there's nothing intrinsic in the inference math that says they are predictors, and we typically run inference with a high enough temperature that we don't actually generate the maximum-likelihood tokens anyway.
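To make the temperature point concrete, here is a minimal sketch in plain Python. The logits are made-up numbers for a four-token vocabulary, and `softmax`, `greedy_token`, and `sample_token` are hypothetical helper names, not any library's API. It only illustrates the general idea that sampling at temperature 1.0 frequently picks something other than the most likely token, while a very low temperature collapses toward the argmax.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_token(logits):
    """Index of the maximum-likelihood token (argmax)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up logits for an illustrative four-token vocabulary.
logits = [2.0, 1.5, 0.5, -1.0]
rng = random.Random(0)  # seeded so the run is repeatable

greedy = greedy_token(logits)
samples = [sample_token(logits, 1.0, rng) for _ in range(1000)]

# Fraction of draws that were NOT the maximum-likelihood token.
non_greedy_fraction = sum(s != greedy for s in samples) / len(samples)
print(f"argmax token: {greedy}")
print(f"share of samples that differ from argmax: {non_greedy_fraction:.2f}")

# A very low temperature puts almost all probability on the argmax.
print(f"p(argmax) at T=0.05: {softmax(logits, 0.05)[greedy]:.5f}")
```

With these particular logits the top token has a probability of roughly 0.53 at temperature 1.0, so close to half of the sampled tokens are not the "prediction" in the maximum-likelihood sense.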

The whole terminology around these things is hopelessly confused.

holoduke · 2026-02-05T20:31:22 1770323482

Well, it's the prediction part that is complicated. How that works is a mystery. But even our LLMs are, to some extent, a mystery.

unshavedyak · 2026-02-05T21:14:46 1770326086

I mean, I don't think that statement is far off. Much of what we do is entirely about predicting the world around us, no? From physics (where the ball will land) to the emotional state of others in response to our actions (theory of mind), we operate very heavily on a predictive model of the world around us.

Couple that with all the automatic processes in our mind (blanks filled in that we never observed, yet are convinced we did) and the hormone states that drastically affect our thoughts and actions.

And the result? I'm not a big believer in our uniqueness, or in the level of autonomy so many think we have.

With that said, I am in no way saying LLMs are even close to us, or even remotely the right implementation to get close to us. The complexity of our "stack" alone dwarfs LLMs. I'm not even sure LLMs are up to a worm's brain yet.