
It doesn’t know what it doesn’t know.


It doesn't know that because it wasn't trained on any tasks that required it to develop that understanding. There's no fundamental reason an LLM couldn't learn "what it knows" in parallel with the things it knows, given a suitable reward function during training.
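
In sketch form, one way to do that is to train an auxiliary "P(I know this)" head alongside the usual answer loss, scored against graded correctness. Everything below (the output names, the 0.5 weighting) is an illustrative assumption, not any lab's actual recipe:

    import torch
    import torch.nn.functional as F

    def training_loss(answer_logits, answer_targets,
                      self_assessment_logit, answer_was_correct):
        # Standard answer loss over the generated tokens.
        answer_loss = F.cross_entropy(answer_logits, answer_targets)
        # Calibration term: the model's own P(correct) should match
        # whether a grader judged the answer correct.
        target = torch.tensor([1.0 if answer_was_correct else 0.0])
        calibration_loss = F.binary_cross_entropy_with_logits(
            self_assessment_logit, target)
        return answer_loss + 0.5 * calibration_loss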


Well, sure. But maybe the token logprobs can be used to produce a confidence assessment.
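
As a minimal sketch (assuming an API that returns one logprob per generated token, like the logprobs option in the OpenAI API), you could fold them into a single score; the geometric-mean token probability below is one crude, uncalibrated choice:

    import math

    def confidence_from_logprobs(token_logprobs: list[float]) -> float:
        """Geometric-mean token probability of the completion, in [0, 1]."""
        if not token_logprobs:
            return 0.0
        mean_logprob = sum(token_logprobs) / len(token_logprobs)
        return math.exp(mean_logprob)

    # A confidently generated answer vs. a hesitant one:
    print(confidence_from_logprobs([-0.05, -0.10, -0.02]))  # ~0.94
    print(confidence_from_logprobs([-1.2, -2.3, -0.9]))     # ~0.23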


Anthropic has a great paper on exactly this!

https://www.anthropic.com/research/language-models-mostly-kn...

The best part is its plummeting confidence when it begins to answer “Why are you alive?”

Big same, Claude.


That's not true for all types of questions. You've likely seen a model decline to answer a question that requires more recent training data than it has, for example.



