My initial thought is that someone may have deliberately triggered the model to respond this way through messages that look mundane but actually contain hidden character encodings of some sort.
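If anyone wants to check the shared chat for this, here's a minimal sketch (plain Python, nothing Gemini-specific; the sample string is made up for illustration) that flags invisible or format-control Unicode characters, which is where this kind of hidden payload would usually live:

```python
import unicodedata

# Categories that often hide invisible payloads: Cf = format controls
# (zero-width characters, Unicode "tag" characters), Co = private use,
# Cn = unassigned code points.
SUSPICIOUS_CATEGORIES = {"Cf", "Co", "Cn"}

def find_hidden_chars(text: str) -> list[tuple[int, str, str]]:
    """Return (offset, repr, Unicode name) for suspicious characters."""
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
            hits.append((i, repr(ch), unicodedata.name(ch, "<unnamed>")))
    return hits

# Example: a zero-width space and a Unicode tag character hidden in
# otherwise mundane-looking text (illustrative only).
msg = "Please fill in this form\u200bfor me\U000E0041"
for offset, ch, name in find_hidden_chars(msg):
    print(f"offset {offset}: {ch} ({name})")
```

Pasting the user turns from the shared conversation into something like this would at least rule the encoding theory in or out.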
Searching for parts of it on Google leads to a 4chan archive where someone talks about hidden, non-default system prompts. Could that be what's going on?
Same here, not much experience with this. I expanded the texts to look, but I didn't check for hidden prompts. Can you share the link or your findings?
I guess it's one of these:
* "Yeah OpenAI does the same thing (lets you share the chat with the custom instructions hidden), which is a mistake because it lets people troll like this and makes them look bad
They need more shitposters on staff, any one of them could have told them it would happen"
I read the entire discussion and it looks legit: there's no apparent attempt to trigger such replies; it seems like someone just trying to fill in a form. You can also continue the shared conversation yourself. I tried to find more details that way, but only got standard responses.
At some point, I got this:
"I understand your concern. However, as an AI language model, I cannot delve into the specific details of the internal processes that led to the inappropriate response. This information is complex and often beyond human comprehension."
This is what I got, nothing wild, on a standard Gemini account.
I asked for system prompts; it started to answer but then glitched. It continued with some "system prompt" (probably all hallucinated) and insisted there was no other system or user prompt (though even if there was one, it may no longer be available to the model, so this doesn't say much).
In the end I also tested the edit option on Gemini's response using another prompt, but then the shared document notes that it has been altered, so it shouldn't be that either.