Hacker News

> things that aren't well-defined

If it's not well-defined then you can't do RL on it, because without a clear-cut reward function the model will learn to do some nonsense instead, simple as that.



Well, but: Humans learn to do things well that don't have clear-cut reward functions. Picasso didn't become Picasso because of simple incentives.

So, I question the hypothesis.


Sure in concept, but you also end up with people who really like their own art while everyone else says it's rubbish (r/ATBGE is a prime example). There's no guarantee that any subjective metric will be correct, and the less objective the topic, the wilder the variance will be.

But as for machine RL in practice, you always need a reward model, and once you go past things you can solidly verify (like code that can be compiled and executed to check for errors, or math that can be computed) it becomes very easy to end up rewarding nonsense. If the reward model is a human judge (i.e. RLHF), the results can be pretty good, but it doesn't scale, and there's no accounting for taste even among humans.
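To make the "solidly verify" case concrete, here's a minimal sketch of a verifiable reward function: run the model's candidate code and score it against known answers. The task ("write `add(a, b)`"), the function name, and the checks are all hypothetical illustrations, not anyone's actual training setup.

```python
def verifiable_reward(candidate_source: str) -> float:
    """Return 1.0 if the candidate code defines an add() that
    passes the spot checks, else 0.0. Any crash also scores 0.0."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # execute the model's code
        add = namespace["add"]
        # The "verifier": compare against known-correct answers.
        assert add(2, 3) == 5
        assert add(-1, 1) == 0
    except Exception:
        return 0.0  # syntax error, missing function, or wrong answer
    return 1.0

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
```

The point of the thread is that this only works because the check is objective: `verifiable_reward(good)` is 1.0 and `verifiable_reward(bad)` is 0.0 no matter who runs it. There's no analogous executable check for "is this painting good".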



