"I reimplemented GPT-2 from scratch in C++ as an exercise to really understand the nuts and bolts of LLMs. GPT2-117M isn't a super great model, but it's extremely satisfying to get it to generate basically the same thing as other reference implementations."
"I" refers to the guy who wrote this. I, version_five, have nothing to do with it; I just thought it looked cool.
Well, I downloaded and compiled it (cool! Thanks!), but no matter what prompt I give it, it just prints out gibberish. Where do I go now to learn how to use it properly?