This is the move right here. And it doesn't have to be a GPT-4 I-know-absolutely-everything model. I think they could train a <1B param model that is capable of being cute and interactive.
Dialogue is the area where the interactivity of video games falls down completely. I mean, dialogue in a video game can be just as good as the dialogue in a movie but it has to be "on rails" because of the lack of linguistic competence of today's computers.
Even with a much better system it has to be "on rails" in the sense that a video game character will get in trouble if you draw it into an extended enough conversation. This can be somewhat answered by "people have various ways to set boundaries," and that, of course, can be part of the characterization.
Y'all are probably sick of me talking about how Tamamo-no-mae in the game Fate/Extella is the pinnacle of characterization in modern games. But she is based on a legendary character who can charm people by talking intelligently about any subject, she is connected to a photonic crystal computer that has recorded all of Earth's history, and given the relationship she has with the protagonist, any canned answer she gives to why she doesn't realize her promise would be terribly disappointing.
Video games have the separate complication, though, that dialogue needs to work holistically with the rest of the plot. There's no such requirement for a very fancy Furby.
I've wanted an evil Teddy Ruxpin: one that is really evil, wants to lower your utility function, and tries really hard to do it (violating the First Law of Robotics) but just isn't strong enough to really hurt you. Though I figure the bar for creating a catastrophe isn't really that high; all it has to do is start a fire.
I am not sure about that! Small, more specialized models are emerging that can have much lower latency. For example: https://arxiv.org/abs/2305.07759
This might be worked around with a characteristic verbal affectation that makes the character's utterances initially long-winded: once the LLM's lag has passed, the proper content can begin to flow. Thinking fast and slow.
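A minimal sketch of that idea, with a hypothetical `slow_generate` standing in for the actual model call: the character delivers a canned, in-character filler line immediately while the real reply is generated on a background thread, then appends the considered answer once it arrives.

```python
import threading
import time
import queue

# In-character filler lines the character can speak instantly
# ("thinking fast") while the model is still generating.
FILLERS = [
    "Hmm, well, you see, my dear...",
    "Ah! Now that is a question worth pondering...",
]

def slow_generate(prompt, out):
    """Stand-in for an LLM call; the sleep models inference latency."""
    time.sleep(0.5)
    out.put(f"(considered reply to: {prompt})")

def reply(prompt):
    out = queue.Queue()
    worker = threading.Thread(target=slow_generate, args=(prompt, out))
    worker.start()
    # "Thinking fast": the affectation goes out with no lag.
    filler = FILLERS[0]
    # "Thinking slow": block for the real content only after the
    # filler has bought the model some time.
    considered = out.get()
    worker.join()
    return filler + " " + considered

print(reply("Why haven't you kept your promise?"))
```

In a real game you'd stream the filler through text-to-speech while tokens arrive, rather than joining on a thread, but the shape is the same: latency hidden behind characterization.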