Just to understand what you mean: you're worried that two or more LoRAs could have almost identical content and differ only in their timestamps. Yeah, that could be a problem...
Hmm... perhaps the similarity/recency handling doesn't need to be incorporated through orthogonality at all, but could instead be implemented through weight preference, such that the new weight is influenced by the weight history.
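To make the "new weight influenced by weight history" idea concrete, here is a minimal sketch using an exponential moving average. All names (`update_gate_weight`, `alpha`) are illustrative, not from any existing library:

```python
# Hypothetical sketch: instead of forcing a new LoRA update to be
# orthogonal to recent ones, bias its gating weight toward its own
# history with an exponential moving average (EMA).

def update_gate_weight(history, raw_weight, alpha=0.3):
    """Blend a freshly computed gate weight with its running history.

    history: previous smoothed weight (None on the first update)
    raw_weight: weight proposed by the current training step
    alpha: how strongly the new observation moves the smoothed value
    """
    if history is None:
        return raw_weight
    return (1 - alpha) * history + alpha * raw_weight

# Two near-identical updates arriving close in time drift only slowly,
# so recency alone does not force the LoRAs apart.
w = None
for raw in (0.8, 0.82, 0.81):
    w = update_gate_weight(w, raw)
```

With `alpha` small, the weight is dominated by history; with `alpha` near 1 it tracks the newest update, so this one knob trades stability against responsiveness.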
Thanks, that is a very good point, will take that into consideration!
Exactly, that's the worry. If two similar updates happen close in time, orthogonality forces them apart for no reason.
I like your weight preference idea. Maybe only apply orthogonality when the updates are actually different, not when they're nearly the same. You should probably try that if you haven't already. :)
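The "only apply orthogonality when updates actually differ" suggestion could be sketched as a similarity-gated penalty. This is a toy illustration with made-up names and an assumed threshold, not a tested recipe:

```python
import math

def cosine(u, v):
    """Cosine similarity between two update-direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def orthogonality_penalty(u, v, sim_threshold=0.95):
    """Penalize overlap between two LoRA update directions, but only
    when they are genuinely different; near-duplicates (similarity at
    or above the threshold) are left alone instead of forced apart."""
    sim = cosine(u, v)
    if sim >= sim_threshold:
        return 0.0          # near-duplicate: skip the penalty
    return sim ** 2         # otherwise push toward orthogonality
```

Near-identical updates close in time then incur no penalty, which addresses the worry above, while genuinely different updates still get separated.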
This is more of a conceptual idea from an AI hobbyist; hopefully the big motivating claims aren't too distracting. After doing too many basic-level tutorials, I think this could be an interesting intermediate-level project for applying modern AI architectures. What's your opinion?
I mean, since GPT-4, I no longer believe that RAM works the miracle of LLM performance scaling directly with model size. At least ChatGPT itself convinced me that any decent-sized company can create a GPT-4 equivalent in terms of model size; what limits them is the service layer, like memory caching and hallucination handling. Companies buy RAM simply to ride the stock hype.
I'm no expert, so this is a shallow take, but I think the global LLM has already reached its limit, and general AGI may only be possible if the model is living in the moment, i.e., retraining every minute or so, paired with a much smaller device that can observe its surroundings, like a robot.
Instead of a KV cache, my idea is to use LoRAs: a central LLM left unchanged by learning, surrounded by dozens or thousands of LoRAs kept orthogonal to each other, each competing via weights to be trained every minute or so. The LLM (since it's an RNN anyway) produces a "summarize what your state and goal are at this moment" output and trains the LoRAs on that summary along with all the observations and, say, inputs from users. The LoRAs' outputs feed back into the LLM, which then decides the weights for further LoRA training.
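The loop described above could be sketched roughly like this. Everything here is a toy stand-in with made-up names: real LoRAs are low-rank updates to weight matrices, but each "adapter" below is just a vector added to the base output, to show the frozen-base / gated-adapter structure:

```python
# Toy sketch of the proposed loop: a frozen base model, a pool of
# small adapters ("LoRAs"), and gate weights assigned each cycle.
# Only relevant (nonzero-gate) adapters are trained; the base is frozen.

FROZEN_BASE = [1.0, 0.0, 0.0]          # stand-in for the unchanged LLM

def forward(adapters, gates):
    """Combine the frozen base output with gate-weighted adapter outputs."""
    out = list(FROZEN_BASE)
    for adapter, g in zip(adapters, gates):
        for i, a in enumerate(adapter):
            out[i] += g * a
    return out

def train_cycle(adapters, gates, observation, lr=0.1):
    """One 'minute' of the loop: adapters with nonzero gates move
    toward the current observation; the base model never changes."""
    for adapter, g in zip(adapters, gates):
        if g == 0.0:
            continue                    # irrelevant adapters stay untouched
        for i in range(len(adapter)):
            adapter[i] += lr * g * (observation[i] - adapter[i])
    return adapters
```

In the full proposal the gates themselves would come from the LLM's own "state and goal" summary each cycle, closing the feedback loop the post describes.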
Anyways, I'm just thinking there needs to be some kind of structural change.
My understanding is that this is what the LoRAs are for; my belief is that they serve as "memory" of their live observations (a more NN-like cache, say), while the main LLM remains unchanged. The LoRAs are also weighted, so that LoRAs irrelevant to the current task are not trained, while the relevant ones are reinforced.
But I never built it, so I am not sure if such an emergent state will appear or not.
Is it just me, or are hobby electronics shops much harder to find today? The kind that sells Arduinos, basic RCLs, and common ICs. I'm not sure if it's just the trend of everything being sold online, or if interest is shifting toward software.
Because China is taking over in that sector, why should I pay triple when I can purchase it straight from the manufacturer? You can find anything electronics-related on AliExpress.