I'd be super interested in more information on this! Do you mean abandoning unsupervised learning completely?
Prompt Injection seems to me to be a fundamental problem in the sense that data and instructions are in the same stream and there's no clear/simple way to differentiate between the two at runtime.
I haven't thought about it deeply. But I guess it's about allowing the model to easily distinguish the prompt from the conversation. Models seem to get confused with escaping, which is fair enough, escaping is very confusing.
It's true that for the transformer architecture the prompt and conversation are in the same stream.
However you could do something like activate a special input neuron only for prompt input.
Or have the prompt a fixed size (e.g. a fixed prefix size).
And then do a bunch of adversarial training to punish the model when it confuses the prompt and conversation :)