He also called permissively licensing Tesla's patents "open sourcing" them. He's...

drexlspivey · on March 17, 2024

The “source” in “open source” refers to source code which they released. A dataset is not source code, if anyone is misusing the term it’s you.

HarHarVeryFunny · on March 18, 2024

If you can't rebuild it, then how can you be considered to have the "source code" ?

The training data isn't a dataset used at runtime - it's basically the source code to the weights.

Not sure it really matters here though (who has the GPUs and desire to retrain Grok?), but just as a matter of definition "open weights" fits better than "open source".

frabcus · on March 17, 2024

I consider the weights a binary program and the source code is the training data. The training algorithm is the compiler.

I agree this isn't standard terminology, but it makes the most sense to me in terms of power dynamics and information flow.

We know from interpretability research that the weights do algorithms eg sin approximation etc. So they feel like binary programs to me.

solarkraft · on March 18, 2024

https://youtu.be/WyTzRnGSlcI?t=88