> Imagine an insurance company giving you a different quote every time you check on their website
It's disingenuous that the author uses an insurance quote site as an analogy for their essay-grading bot giving different grades to the same paper. The example doesn't need an analogy: a human grading papers would do the same thing if they didn't remember reading the paper.
>> Imagine an insurance company giving you a different quote every time you check on their website
I mean, it's already a well-established practice, maybe not in insurance, but in plenty of other markets. Airlines and ticket-booking services do this. E-commerce sites sometimes do this too. So it is a weird example indeed.
Yes, and it's bad when humans do it too. Mitigating it when possible is good systems design. Relative determinism is something people have come to expect of computers. It's not some condemnation of LLMs; it's just a thing you have to keep in mind when using the tool.
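For instance, here's a minimal sketch of one mitigation, assuming the OpenAI Python client (model name and prompt are placeholders; the `seed` parameter is best-effort, so even this only buys you *relative* determinism, not a hard guarantee):

```python
# Sketch: pin temperature and seed to reduce run-to-run variance.
# Assumes the OpenAI Python client; model and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

def grade_essay(essay_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": "You are a professor grading essays."},
            {"role": "user", "content": essay_text},
        ],
        temperature=0,  # near-greedy decoding: far less variance between runs
        seed=42,        # best-effort reproducibility, not a guarantee
    )
    return response.choices[0].message.content
```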
The computer, in this case, was instructed to take on a human role.
My point is that if you ask a computer to critique a highly subjective medium, then as a user this is what I'd expect, if I knew the system wasn't allowed to save its previous responses (for some reason... maybe bad system design?).
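The obvious fix, sketched under the assumption that you can hash the submission and keep a store of past grades (`grade_essay` here is a hypothetical stand-in for whatever actually calls the model):

```python
# Sketch: cache grades by a hash of the essay text, so resubmitting the
# exact same paper returns the exact same grade.
import hashlib

def grade_essay(essay_text: str) -> str:
    """Hypothetical stand-in for the call that actually hits the model."""
    ...

_grade_cache: dict[str, str] = {}

def cached_grade(essay_text: str) -> str:
    # Key on a hash of the text so identical resubmissions hit the cache.
    key = hashlib.sha256(essay_text.encode("utf-8")).hexdigest()
    if key not in _grade_cache:
        _grade_cache[key] = grade_essay(essay_text)
    return _grade_cache[key]
```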
The entire point of taking on the role of a professor isn't to give a final grade. It's to teach the student what they could do to make their work better. And the LLM did an excellent job at that.
Maybe that's bad system design, but the model this system emulates is an academic one.