The core of an LLM is completely deterministic. The randomness seen in LLM outpu...

The core of an LLM is completely deterministic. The randomness seen in LLM output is purely the result of post processing the output of the pure neural net part of the LLM, which exists explicitly to inject randomness into the generation process.

This what the “temperature” parameter of an LLM controls. Setting the temperature of an LLM to 0 effectively disables that randomness, but the result is a very boring output that’s likely to end up caught in a never ending loop of useless output.