The way that LLMs hallucinate seems to have everything to do with how they represent knowledge. Just look at the cost function: it's called log likelihood for a reason. The only real goal is to produce a sequence of tokens that is plausible in the most abstract sense, not one consistent with the concepts in a sound model of reality.
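To make that concrete, here is a minimal sketch of the objective. The toy bigram model and its probabilities are invented for illustration; the point is only that the loss rewards common token transitions, not factual content:

```python
import math

# Toy next-token model: probability of the next token given only the
# previous one. These numbers are made up for illustration.
probs = {
    ("the", "cat"): 0.4,
    ("the", "moon"): 0.1,
    ("cat", "sat"): 0.5,
    ("moon", "sat"): 0.5,
}

def negative_log_likelihood(tokens):
    """Sum of -log p(next | prev) over the sequence -- the training loss."""
    return sum(-math.log(probs[(prev, nxt)])
               for prev, nxt in zip(tokens, tokens[1:]))

# A fluent-but-wrong sequence scores a *lower* loss than a less common one,
# because nothing in the objective distinguishes plausible from true.
print(negative_log_likelihood(["the", "cat", "sat"]))   # ~1.61
print(negative_log_likelihood(["the", "moon", "sat"]))  # ~3.00
```

Under this objective, "the cat sat" is strictly preferred to "the moon sat" regardless of which one is true in context, which is the structural issue being pointed at.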
Consider that when models hallucinate, they are still doing quite well at what we trained them to do, which is to produce text that is likely. So they implicitly fall back on more general patterns in the training data, e.g. grammar and simple word choice.
I have to imagine that the right architectural changes could still mostly or completely solve the hallucination problem. But it remains an open question whether we could make those changes and still get a model that trains efficiently.
Update: I took out the first sentence where I said "I don't agree" because I don't feel that I've given the paper a careful enough read to determine if the authors aren't in fact agreeing with me.
You can never completely solve the problem because it's mathematically undecidable, which you probably didn't need this preprint to intuit. That said, the better question is whether you can get good enough performance.