Hacker News

> things that aren't well-defined

If it's not well-defined then you can't do RL on it, because without a clear-cut reward function the model will learn to do some nonsense instead, simple as that.



Well, but: Humans learn to do things well that don't have clear-cut reward functions. Picasso didn't become Picasso because of simple incentives.

So, I question the hypothesis.


Sure in concept, but you also end up with people who really like their own art while everyone else says it's rubbish (r/ATBGE is a prime example). There's no guarantee that any subjective metric will be correct, and the less objective the topic, the wilder the variance will be.

But as for machine RL in practice, you always need a reward model, and once you go past things you can solidly verify (like code that can be compiled and executed to check for errors, or math that can be computed) it becomes very easy to end up rewarding nonsense. If the reward model is a human judge (i.e. RLHF), the results can be pretty good, but it doesn't scale, and there's no accounting for taste even among humans.
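To make the "solidly verify" case concrete, here's a minimal sketch of a verifiable reward function: run the model's candidate code and score it against known answers. The task ("write `add(a, b)`"), the function name, and the checks are all hypothetical illustrations, not anyone's actual training setup.

```python
def verifiable_reward(candidate_source: str) -> float:
    """Return 1.0 if the candidate code defines an add() that
    passes the spot checks, else 0.0. Any crash also scores 0.0."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # execute the model's code
        add = namespace["add"]
        # The "verifier": compare against known-correct answers.
        assert add(2, 3) == 5
        assert add(-1, 1) == 0
    except Exception:
        return 0.0  # syntax error, missing function, or wrong answer
    return 1.0

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
```

The point of the thread is that this only works because the check is objective: `verifiable_reward(good)` is 1.0 and `verifiable_reward(bad)` is 0.0 no matter who runs it. There's no analogous executable check for "is this painting good".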



