I feel like the fact that there's no good explanation of why ML works this well gives a lot of people room to invent their own head-canon, usually drawn from their field of expertise. I've seen this from exceptionally intelligent individuals too. If you only have a hammer...
I think it would be more unusual, and concerning, if an intelligent individual didn't attempt to apply their expertise to form a head-canon of something unknown.
Coming up with an idea for how something works by applying your expertise is the foundation of intelligence and learning, and it has been behind every single advancement of human understanding.
People thinking is always a good thing. Thinking about the unknown is better. Thinking with others is best, and sharing those thoughts isn't somehow bad, even if they're not complete.
Even with LLMs, there's no real mystery about why they work so well - they produce human-like input continuations (aka "answers") because they are trained to predict continuations of human-generated training data. Maybe we should be a bit surprised that the continuation signal is there in the first place, but given that it evidently is, it's no mystery that LLMs are able to use it - just testimony to the power of the Transformer as a predictive architecture, and of course to gradient descent as a cold, unthinking way of finding an error minimum.
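A minimal sketch of that objective, assuming a PyTorch-style model that maps token ids to logits over the vocabulary (names here are illustrative, not from any particular codebase):

    import torch
    import torch.nn.functional as F

    def next_token_loss(model, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer ids from human-written text
        inputs  = tokens[:, :-1]   # the sequence as seen so far
        targets = tokens[:, 1:]    # the same sequence shifted left by one
        logits  = model(inputs)    # (batch, seq_len - 1, vocab_size)
        # cross-entropy between the predicted distribution and the token that
        # actually came next; gradient descent on this is the entire objective
        return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               targets.reshape(-1))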
Perhaps you meant how LLMs work, rather than why they work, but I'm not sure there's any real mystery there either - the transformer itself is all about key-based attention, and we now know that training a transformer seems to consistently cause it to leverage attention to learn "induction heads" (using pairs of adjacent attention heads), which are the main data-finding/copying primitive it uses to operate.
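The key-based attention part is small enough to write out - this is the standard scaled dot-product form (causal masking omitted), not anything induction-head specific:

    import math
    import torch

    def attention(q, k, v):
        # q, k, v: (batch, heads, seq_len, head_dim)
        # every query position scores every key position; softmax turns the
        # scores into a "where should I look?" distribution over the sequence,
        # which is then used to mix the value vectors
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        return torch.softmax(scores, dim=-1) @ v

Induction heads are just a particular learned use of this primitive, composed across two layers.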
Of course, knowing how an LLM works in broad strokes isn't the same as knowing specifically how it works in any given case - how it transforms a specific input, layer by layer, to create the given output - but that seems a bit like saying that because I can't describe precisely why you had pancakes for breakfast, we don't know how the brain works.
Don't we all experience this from time to time? When I'm focused on solving some mathematical problem I'm not thinking in words, but in concepts. When you are thinking in words you are also thinking of a concept; the only difference is that sometimes there are no words associated with it. In my opinion, words and sentences are just labels for the thinking process, a translation of what is really going on inside, not the driver of it.
My experience is mostly with GPT-4. Treat it like a beginner programmer. Give it small, self-contained tasks; explain the possible problems, the limitations of the environment you are working with, and likely hurdles; and suggest API functions or language features to use (it really likes to forget that there is a specific function doing half of what you need, and staples several together instead).
Try it for different tasks and you will get a feel for what it excels at and what it won't be able to solve. If it doesn't give a good answer after 2 or 3 attempts, just write it yourself and move on; giving feedback barely works in my experience.
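For what it's worth, this is roughly how I'd structure such a request with the current OpenAI Python SDK - the model name, environment constraints, and task here are placeholders, not a recipe:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    system = (
        "You are helping with a small, self-contained task. "
        "Environment: Python 3.10, no network access, stdlib only. "
        "Prefer pathlib.Path.glob over manual os.listdir filtering."
    )
    user = (
        "Write a function that returns all *.csv files under a directory, "
        "newest first. Skip symlinked directories."
    )

    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    print(resp.choices[0].message.content)

Spelling out the environment and the preferred API up front is what keeps it from inventing a convoluted solution to half the problem.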
> There were a lot of people who were just reciting the best practice rules they'd learned from blog posts, without really having the experience to know where the advice was coming from, or how best to apply it
This is exactly my experience too. The problem with learning things from YouTube and blogs is that whatever the author decides to cover is what we end up knowing, but they never intended to give a comprehensive lecture on these topics. The result is people who dogmatically apply some principles and entirely ignore others - neither of which really works. (I'm also guilty of this in ML topics.)
WPA2 also had an exploit (KRACK) even though the handshake algorithm itself was "proven to be secure" - the attack forced reinstallation of an already-in-use key, behavior that sat outside what the proof modeled. Formal verification is a powerful tool, but it does not guarantee bug-free code: it merely guarantees that the particular bugs you checked for are not possible.
Based on response times, it probably just caches some answers and has a limited tokens/second quota toward ChatGPT, so it sometimes returns an error if you ask a question that needs to be forwarded.
I'm running models locally on my 3090 and it's fast enough, although building a vector database, for example, can take a while. I can run LoRA training but I haven't done anything meaningful with it so far. I chose the 3090 because of the 4090's power-connector issue (also, the 4090 has no NVLink, although I'm not sure that matters), but it's debatable whether my fears are justified. I need to leave the GPU running while I'm away and I just don't feel comfortable doing that with a 4090. I'd rather take the lower performance.
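For context, the vector-database step is essentially just embedding text chunks and indexing them - a rough sketch assuming a sentence-transformers + FAISS setup (model name and chunks are placeholders, not necessarily my exact stack):

    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")  # runs on the 3090

    chunks = ["first document chunk", "second document chunk"]  # in practice: your chunked docs
    emb = model.encode(chunks, batch_size=256, convert_to_numpy=True,
                       normalize_embeddings=True)

    index = faiss.IndexFlatIP(emb.shape[1])  # inner product == cosine on normalized vectors
    index.add(emb)

    query = model.encode(["some question about the docs"], convert_to_numpy=True,
                         normalize_embeddings=True)
    scores, ids = index.search(query, 5)     # top-5 nearest chunks

The embedding pass is the part that ties up the GPU for a while on a large corpus; the index itself builds quickly.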
One caveat though: my Asus B650E-F is barely supported by the current Ubuntu kernel (e.g. my microphone doesn't work, and before upgrading the kernel + BIOS I didn't have a LAN connection...), so expect some problems if you want to use a relatively new gaming setup for Linux.
One possible explanation is that Putin wants to prevent a potential rival from winning over the oligarchs by promising to make an agreement with the EU. Many oligarchs are losing a lot of money on this war, and a coup that replaces him with a friendlier leader could plausibly ease the sanctions. Without the gas leverage, this option is less likely. There are clear signs that Putin is very paranoid about being replaced in the near future.
Several of the Russians I talk to think Igor Sechin is the one maneuvering for a palace coup. A lot of "close Putin allies" have died since the war started. Sechin runs Rosneft, so if anyone is positioned to "turn the gas back on" it might be him. If he was shopping around the idea "let's get rid of Putin, turn the gas on, and get back to printing money again", then Putin blowing up the pipeline as a necessary evil to kneecap Sechin sounds possible.