
That whole structured generation line of work looks promising. I hope someone else takes this and runs evaluations on other benchmarks. Curious to see if the results translate!


Agreed! While these results are very promising, there's still a lot to explore in this space.

In addition to the "prompt consistency" and "thought-control" ideas mentioned in the post, I'm definitely curious how performance holds up on more complex structured data (things like codegen).



