More

godelski · 2026-02-23T23:46:24 1771890384

  > Way too many rewrites fail because people try to "improve" things during the port

I'd say that porting is a great time to "improve" many things, but like you suggest, not a great time to add new features. You can do a lot of improvements while maintaining output parity. You're in the weeds, reading the code, thinking about the routines, and you have all the hindsight of having done it already. Features are great to add as comments that sketch things out but importantly this is a great time to find and recognize that maybe a subroutine is pretty inefficient. I mean the big problem in writing software is that the goal are ever evolving. You wrote the software for different goals, different constraints. So a great time to clean things up, make them more flexible, more readable, *AND TO DOCUMENT*.

I think the last one gets ignored easily but my favorite time to document code is when reading it (but the best time is when writing it). It forces you to think explicitly about what the code is doing and makes it harder for the little things to slip by. Given that Ladybird is a popular project I really do think good documentation is a way to accelerate its development. Good documentation means new people can come in and contribute faster and with fewer errors. It lowers the barrier to entry, substantially. It's also helpful for all the mere mortals who forget things

JuniperMesos · 2026-02-24T00:35:20 1771893320

LLMs are great at producing documentation - ask one "hey can you add a TODO comment about the thing on line 847 that is probably not the best way to do this?" while you're working on the port, and it will craft a reasonably-legible comment about that thing without further thought from you, that will make things easier for the person (possibly future-you) looking at the codebase for improvements to make. Meanwhile, you keep on working on the port that has byte-for-byte identical output.

godelski · 2026-02-23T21:37:32 1771882652

I used to work as a physical engineer and a common task is "where's that tool?" People leave things at their work station and they float around and well... you can't keep track of things you can't see.

Manager finally got fed up (yes, he was the biggest offender lol) and we organized the whole shop. Gave every tool a specific place. Required tools to be put back. But it actually became easier to put back because everything had a home and we made it so their home was accessible (that's the trick).

Took us like a week to do and it's one of those things that seemed useless. But no one had any doubts of the effectiveness of this because it'd be really difficult to argue that we didn't each spend more than a week (over a year) searching for things. Not only that, it led to fewer lost and broken tools. It also just made people less frustrated and led to fewer arguments. Maybe most important of all, when there was an emergency we were able to act much faster.

So that's changed my view on organizing. It's definitely a thing that's easy to dismiss and not obviously worth the investment but even in just a year there's probably a single event that is solved faster due to the organization. The problem is you have to value the money you would have lost were you not organized. It's invisible and thus easy to dismiss. It's easier because everything else seems so important. But there's always enough time to do things twice and never enough time to do it right.

godelski · 2026-02-23T21:22:56 1771881776

I think some people frame Yak shaving as a bad thing but I'm not sure it always is, and often even it is it is bad because you're resolving debt.

The example with Hal is funny, repeatable (I share it frequently), but also the tasks are (mostly) independent. It feels more like my ADHD. They're things that need to get done, easy to put off/triage, and but make doing other tasks difficult so maybe they actually shouldn't be put off?

But there's also the classic example we're doing something is a bigger rabbit hole than expected. Usually because we were too naïve and oversimplified the problem. An old manager gave me a good rule of thumb: however long you think something is going to take, multiply it by 3. Honestly I think that number is too low and most people miss the mark. I'm pretty sure he stole it from Scotty from Star Trek but forgot that even that is fantasy.

Personal I think you have to be careful about putting off the little things. It's a million times easier to solve little problems than big ones. So you have to remember that just because it's a little problem now doesn't mean it'll grow. The danger is that it's little, so you forget about it. The shitty part is that if you tell you boss they get upset at you if you solve it now but you look like a genius if you solve it after it festers. Invisible work...

https://scifi.stackexchange.com/questions/99114/source-of-sc...

godelski · 2026-02-22T21:22:41 1771795361

Depends what your research question is, but it's very easy to spoil your experiment.

Let's say you tell it that there might be small backdoors. You've now primed the LLM to search that way (even using "may"). You passed information about the test to test taker!

So we have a new variable! Is the success only due to the hint? How robust is that prompt? Does subtle wording dramatically change output? Does "may", "does", "can", "might" work but "May", "cann", or anything else fail? Have you the promoter unintentionally conveyed something important about the test?

I'm sure you can prompt engineer your way you greater success but by doing so you also greatly expand the complexity of the experiment and consequently make your results far less robust.

Experimental design is incredibly difficult due to all the subtleties. It's a thing most people frequently fail at (including scientists) and even more frequently fool themselves into believing stronger claims than the experiment can yield.

And before anyone says "but humans", yeah, same complexity applies. It's actually why human experimentation is harder than a lot of other things. There's just far more noise in the system.

But could you get success? Certainly. I mean you could tell it exactly where the backdoors are. But that's not useful. So now you got to decide where that line is and certainly others won't agree.

godelski · 2026-02-22T21:06:09 1771794369

It's a pretty common threshold, like 10% is. Be it the 80/20 "Pareto" rule, it's the value of one finger on one hand, or if you really want you stretch the p-value of 0.05 is 1 in 20 odds but that's definitely a stretch though arbitrary anyways. But 20 is a very human number and very common. It's just a division of 5 rather than 4 (I'm assuming you wouldn't have questioned a cutoff at 25%)

greazy · 2026-02-22T22:18:27 1771798707

You've missed my point, it's not the thresholds, it's the categories assigned to the thresholds that need explaining.

godelski · 2026-02-23T04:45:58 1771821958

You're right and I still don't understand your point.

godelski · 2026-02-21T23:34:30 1771716870

  > built out of redstone in minecraft

Ummm...

  > 5 million parameters

Which is a lot more than 888kb... Supposing your ESP32 could use qint8 (LOL) that's still 1 byte per parameter and the k in kb stands for thousand, not million.

Dylan16807 · 2026-02-22T02:01:15 1771725675

https://www.youtube.com/watch?v=VaeI9YgE1o8

Yes I know how much a kilobyte is. But cutting down to 2 million 3 bit parameters or something like that would definitely be possible.

And a 32 bit processor should be able to pack and unpack parameters just fine.

Edit: Hey look what I just found https://github.com/DaveBben/esp32-llm "a 260K parameter tinyllamas checkpoint trained on the tiny stories dataset"

godelski · 2026-02-22T03:28:28 1771730908

  > But cutting down to 2 million 3 bit parameters or something like that would definitely be possible.

Sure, but there's no free lunch

  > Hey look what I just found

I've even personally built smaller "L"LMs. The first L is in quotes because it really isn't large (So maybe lLM?) and they aren't anything like what you'd expect and certainly not what the parent was looking for. The utility of them is really not that high... (there are special cases though) Can you "do" it? Yeah. I mean you can make a machine learning model of essentially arbitrary size. Will it be useful? Obviously that's not guaranteed. Is it fun? Yes. Is it great for leaning? Also yes.

And remember, Tiny Stories is 1GB of data. Can you train it for longer and with more data? Again, certainly, BUT again, there are costs. That Minecraft one is far more powerful than this thing.

Also, remember that these models are not RLHF'd, so you really shouldn't expect it to act like you're expecting a LLM to work. It is only at stage 0, the "pre-training", or what Karpathy calls a "babbler".

Dylan16807 · 2026-02-22T07:10:42 1771744242

A reminder that what I said was "not completely impossible, depending on what your expectations are"

And I was focused more on the ESP32 part than the exact number of bytes. As far as I'm concerned you can port the model from the minecraft video and you still win the challenge.

Also, that last link isn't supposed to represent the best you can do in 800KB. 260k parameters is way way under the limit.

godelski · 2026-02-22T08:12:36 1771747956

That bar has no lower bound though so of course we're talking past one another

Also we're talking about an esp32. They aren't magic

Dylan16807 · 2026-02-22T09:45:14 1771753514

Being able to talk back and forth with coherent sentences has a lower bound, and it's close to the limit of this hardware.

Something that can actually be an "assistant" has its own lower bound, probably a little harder but mostly a matter of training it differently.

godelski · 2026-02-21T23:29:07 1771716547

Depends what you mean.

If you mean something that calls a model that you yourself host, then it's just a matter of making the call to the model which can be done in a million different ways.

If instead you mean running that model on the same device as claw, well... that ain't happening on an ESP32...

I think if you are capable of setting up and running a locally hosted model then I'd guess the first option needs no explanation. But if you're in the second case I'd warn you that your eyes are bigger than your mouth and you're going to get yourself into trouble.

godelski · 2026-02-21T23:24:33 1771716273

Wow, the rare

  bash <(curl foo.sh)

pattern. As opposed to the more common

  curl foo.sh | bash

Equivalent but just as unsafe. If you must do this instead try one of these

  # Gives you a copy of the file, but still streams to bash
  curl foo.sh | tee /tmp/foo.sh | bash
  # No copy of file but ensures stream finishes then bash runs
  bash -c "$(curl foo.sh)"
  # Best: Gives copy of file and ensures stream finishes
  curl foo.sh -o /tmp/foo.sh && bash $_

I prefer the last one

rgoulter · 2026-02-22T04:00:19 1771732819

> Equivalent but just as unsafe.

To my understanding, the main difference between "curl directly to bash" and "curl to a temp file, then execute the temp file" is "the attacker could inject additional malicious commands when curl'd directly to bash".

If you're not going to then also read all the source code from the download script (& the source code used to produce the binaries), this suggests the attitude of "I mistrust anything I can't read; but will trust anything I could read (without having to read it)".

It seems more likely that malicious code would be in a precompiled binary, compared to malicious commands injected into "curl to bash". -- Though, if there have ever been any observed cases of a server injecting commands from "curl ... | tee foo | bash", I'd be curious to know about these.

godelski · 2026-02-22T04:56:55 1771736215

  > the attacker could inject additional malicious commands when curl'd directly to bash

There's another issue actually. You're streaming, so ask yourself what happens if the stream gets cut prematurely. I'll give you an example, consider how this like could be cut prematurely to create major issues

  rm -rf /home/theuser/.config/theprogram/build_dir

A malicious attacker doesn't need to inject code, they can just detect the stream and use a line like the above to destroy your filesystem. Sure, you might preserve root but `rm -rf /home` is for all practical purposes destroying the computer's data for most people

Or it doesn't have to be malicious. It can just happen. The best protection is writing functions since those have to be created and so can't execute until fully streamed. But so much bash is poorly written that well... just check out Anthropic's install script...

  > If you're not going to then also read all the source code

Saving the source code still has a benefit. If something does go wrong you can go read it. Probably a good place to start tbh. In fact, if you're streaming and something goes wrong you'll see exactly what the early termination error did.

Is it good security practice? Absolutely not. Is it a hundred times better than curl-pipe-bash? Absolutely.

jolmg · 2026-02-22T17:03:14 1771779794

>> Equivalent but just as unsafe.

> To my understanding, the main difference between "curl directly to bash" and "curl to a temp file, then execute the temp file"...

It's not a temp file in the sense of a regular file. `<()` is also a pipe, hence equivalent. `curl` and `bash` run concurrently.

Running one after the other wouldn't be all that much of an improvement anyway if it's done automatically. One really should manually review the script before running it.

0xbadcafebee · 2026-02-22T03:10:05 1771729805

   t=$(mktemp) && [ -w $t ] && curl foo.sh -o $t && echo "$t lksjdfkljshdkfljhdsklfjhslkdjfhsdlkjfhslkdjhf" | sha256sum -c - && bash $t

Uses standard tmp files, makes sure it's writable (tmp file creation can fail), checks cryptographic hash before executing

godelski · 2026-02-22T04:49:35 1771735775

Sure, but now we're not playing code golf. There's much better commands than the ones I wrote but good luck getting people to run them

0xbadcafebee · 2026-02-22T08:02:56 1771747376

Agreed. People would rather have a cute looking command to copy than security or reliability

wakawaka28 · 2026-02-21T23:54:00 1771718040

If you want to be super pedantic, try to make the command shell-agnostic in case the user is not running bash already.

godelski · 2026-02-22T04:48:35 1771735715

Everything I wrote works in bash and zsh. I think this is going to be fine for the vast majority of people. Tbh, I'm not sure what isn't portable, or at least not portable for everything that the curl-pipe-bash pattern doesn't already work for.

maleldil · 2026-02-22T17:46:22 1771782382

`curl foo.sh | bash` works with any shell as long as bash is installed. `bash <(curl foo.sh)` doesn't work on shells that don't have that process substitution syntax (like fish, and I think nushell)

godelski · 2026-02-22T18:47:13 1771786033

Okay so doesn't work for fish or dash but who is using those? More importantly, who is using those and doesn't know how to convert?

It really just seems like you're trying to start a fight and I don't know why. If you're going to fight with anyone go actually click the link and fight with the one who started it

wakawaka28 · 2026-02-22T23:03:21 1771801401

I'm not sure either but some of those things are bashisms. `<(...)` is a bashism and won't work in some shells. Honestly I ONLY use bash, so it's fine for me, but since you seem to be a pedantic person I thought you should consider trying to solve that puzzle too. Someone might want to copy and paste your command into csh, tcsh, zsh, sh, or ksh. Some of these may support bashisms and some don't. I haven't tried to investigate further either, even to the point of talking to an AI about it, but if you want max nerd cred then you can shoot for it. Keep in mind that csh/tcsh is not entirely POSIX compliant, but it is default on some BSDs.

godelski · 2026-02-23T04:55:08 1771822508

Fwiw it works in zsh and I believe ksh (haven't checked). There's not many people that run csh, tcsh, sh, ksh, or even fish.

I could be more pedantic but I'll trade for practicality. I'm sure we could even do better than this [0] but the problem exists because people are lazy. If you got something that is portable and fits a code-golf like mentality then I'm all ears.

There's bigger problems with the lines I wrote besides portability. They don't stop malicious actors nor provide any security. At best they provide defense against early termination and a log to help debug any damage that was done. Not prevent it. Which those are solvable things! But they're solved by more effort, which unfortunately is a losing battle. So the task isn't to solve all the problems, it's to find something that people might actually do that might actually provide some harm reduction, even if it isn't much. Something is better than nothing, right?

[0] https://news.ycombinator.com/item?id=47107740

godelski · 2026-02-21T20:04:46 1771704286

  > when they are only incentivized to lie, cheat, and steal

The fact that they are allowed to do this is beyond me.

The fact that they do this is destructive to innovation and I'm not sure why we pretend it enables innovation. There's a thousands multi million dollar companies that I'm confident most users here could implement, but the major reason many don't is because to actually do it is far harder than what those companies build. People who understand that an unlisted link is not an actual security measure, that things need to actually be under lock and key.

I'm not saying we should go so far as make mistakes so punishable that no one can do anything but there needs to be some bar. There's so much gross incompetence that we're not even talking about incompetence; a far ways away from mistakes by competent people.

We are filtering out those with basic ethics. That's not a system we should be encouraging

judahmeek · 2026-02-21T22:29:10 1771712950

Because the liars who have already profited from lying will defend the current system.

The best fix that we can work on now in America is repealing the 17th amendment to restrengthen the federal system as a check on populist impulses, which can easily be manipulated by liars.

godelski · 2026-02-22T05:01:44 1771736504

  > Because the liars who have already profited from lying will defend the current system.

Okay? And so we just have to deal with it? Give up? Throw in the towel? Not push back?

  > repealing the 17th amendment

Did you read your first sentence?

*By your own logic,* the liars who have already profited from lying will appoint those who will help them defend the current system.

touristtam · 2026-02-21T22:38:04 1771713484

So your senators were appointed before that? No election needed?

bitwize · 2026-02-21T22:42:33 1771713753

Yes, by state legislatures. The concept was the Senate would reflect the states' interests, whereas the House would reflect the people's interests, in matters of federal legislation.

throwaway2037 · 2026-02-22T12:56:25 1771764985

For those unaware, the German Federal democratic system works in a similar way. They have two houses: the Bundestag (directly elected) and the Bundesrat (appointed by state legistatures). As a outsider, their democracy appears to be very high functioning, which demonstrates this form of democracy can work well.

logifail · 2026-02-22T15:02:54 1771772574

> their democracy appears to be very high functioning, which demonstrates this form of democracy can work well

This probably depends on your definition of "working well".

In March 2025, after the last Federal elections were held in Germany (February 2025), but before the new parliament was constituted (within 30 days of the results?), the new governing coalition engineered a constitutional amendment which required a supermajority which they would not have in the new parliament, so instead they held the vote in the old parliament.

https://www.nytimes.com/2025/03/18/world/europe/germany-debt...

This was perfectly legal, although if you explain it to an outsider it might seem like an abuse of process.

throwaway2037 · 2026-02-22T22:07:05 1771798025

I added that last line as a honeypot, as part of my ongoing project on HN. No matter what I say positive about some country, culture, or institution, someone will pop into the conversation to say: "Yes, but what about this one incident. See, X is not so great after all." I think we need an equivalent of Brandolini's law for counterpoint of negativity in all HN discussions. It is as though people think they are disproving a maths proof by counterpoint. That's not the way the Real World of Human Society works. Weirdly, I see the same pattern on Wiki pages about living people. There is always a section of a bunch of random one-off events trying to discredit the person.

To react to your specific incident, I think a more nuanced view would be to say that all highly functioning democracies have incidents that are "perfectly legal, but appear as an abuse of process". I don't really think that detracts from the overall statement that Germany is a highly functioning democracy. Moreover, highly functional democracies regularly change parliamentary rules to reduce incidents like this.

logifail · 2026-02-23T20:18:28 1771877908

> I added that last line as a honeypot

Ouch.

> No matter what I say positive about some country, culture, or institution, someone will pop into the conversation to say: "Yes, but what about this one incident. See, X is not so great after all."

Isn't this what's called "balanced reporting"? Life is shades of grey.

Aside: not that long ago, half of Western Europe used to look up to Germany as it was the home of "Made in Germany" and the place where the trains ran on time ... <chuckle> ... VW emissions and Deutsche Bahn, how times change.

> I think a more nuanced view would be to say that all highly functioning democracies have incidents that are "perfectly legal, but appear as an abuse of process". I don't really think that detracts from the overall statement that Germany is a highly functioning democracy.

I suspect we may need to hear your definition of "a highly functioning democracy" to assess that claim.

If - hypothetically - your political worst enemies were to pull the same stunt immediately after losing an election, binding the winners of said election, would you be as supportive?

disgruntledphd2 · 2026-02-23T11:16:17 1771845377

> To react to your specific incident, I think a more nuanced view would be to say that all highly functioning democracies have incidents that are "perfectly legal, but appear as an abuse of process". I don't really think that detracts from the overall statement that Germany is a highly functioning democracy. Moreover, highly functional democracies regularly change parliamentary rules to reduce incidents like this.

I agree with the repealing of the debt brake (it was a dumb idea that lead to badness, exported right across the EU), but there's no question that how it happened was pretty un-democratic. Like, procedurally it's fine but it was essentially making a big change in a lame-duck session of Parliament.

None of this disputes the notion that Germany is a high functioning democracy, but I guarantee that this action will be brought up again and again by populists in the future, as an example of how the "elites" don't care about democracy. The sad part is, they will be entirely correct in this particular case.

throwaway2037 · 2026-02-23T22:42:59 1771886579

Another idea for the debt brake: What if they set strict limits, like a max of 3% for 7 years, or 5% for 5 years. Literally, you have a "bank of GDP percent points". You can gain them by running a surplus and spend them by running a deficit. Start the initial bank balance at 25%.

    > but I guarantee that this action will be brought up again and again by populists in the future, as an example of how the "elites" don't care about democracy.

This is a good point that I didn't think about.

shimman · 2026-02-22T03:55:34 1771732534

lol what the fuck, no. Can't believe you look at the current system and think "you know what, political parties should be able to choose senators not the citizens." Good lord.

godelski · 2026-02-21T19:55:37 1771703737

  > - All biometric personal data is deleted immediately after processing.

The implication is that biometric data leaves the device. Is that even a requirement? Shouldn't that be processed on device, in memory, and only some hash + salt leave? Isn't this how passwords work?

I'm not a security expert so please correct me. Or if I'm on the right track please add more nuance because I'd like to know more and I'm sure others are interested

wholinator2 · 2026-02-21T20:20:12 1771705212

I'm not an expert but i imagine bio data being much less exact than a password. Hashes work on passwords because you can be sure that only the exact date would allow entry, but something like a face scan or fingerprint is never _exactly_ the same. One major tenant that makes hashes secure is that changing any singlw bit of input changes the entirety of the output. So hashes will by definition never allow the fuzzy authentication that's required with biodata. Maybe there's a different way to keep that secure? I'm not sure but you'd never be able to open your phone again if it requires a 100% match against your original data.

godelski · 2026-02-21T21:09:03 1771708143

I'd assume they'd use something akin to a perceptual hash.

Btw, hashes aren't unique. I really do mean that an input doesn't have a unique output. If f(x)=y then there is some z such that f(z)=y.

Remember, a hash is a "one way function". It isn't invertible (that would defeat the purpose!). It is a surjective function. Meaning that reversing the function results in a non-unique output. In the hash style you're thinking of you try to make the output range so large that the likelihood of a collision is low (a salt making it even harder), but in a perceptual hash you want collisions, but only from certain subsets of the input.

In a typical hash your collision input should be in a random location (knowing x doesn't inform us about z). Knowledge of the input shouldn't give you knowledge of a valid collision. But in a perceptual hash you want collisions to be known. To exist in a localized region of the input (all z are near x. Perturbations of x).

https://en.wikipedia.org/wiki/Perceptual_hashing

Delk · 2026-02-22T10:50:26 1771757426

> Remember, a hash is a "one way function". It isn't invertible (that would defeat the purpose!). It is a surjective function. Meaning that reversing the function results in a non-unique output.

This is a bit of a nitpick and not even relevant to the topic, but that's not the reason cryptographic hashes are (assumed to be) one-way functions. You could in principle have a function f: X -> Y that's not invertible but for which the set of every x that give a particular y could be tractably computed given y. In that case f would not be a one-way function in the computational sense.

Cryptographic hashes are practically treated as one-way functions because the inverse computation would take an intractable amount of time.

godelski · 2026-02-22T20:20:24 1771791624

Yeah that's a good addition. I think often the words we use can really make things more confusing. Like I hate when people say invertible but in reference to a function that isn't bijective. Why not say reversible? (No complaints with the convention of image/preimage)

Which it's very similar to the problem created by saying "one way". It just isn't one way. Going the other direction is perfectly possible but incredibly hard to find the origin. The visual metaphor I like to use for people is it's like you walk out of a room and into a hallway of doors that are all identical looking. Ignoring the fact that you could just physically turn around, it'd be very hard to figure out which one you actually came from.

But maybe what I like least is that we end up having so many terms for the same general concept. It's one thing when they're discovered independently but I'm pretty confident the computer scientists that pioneered hashes were quite familiar with the mathematics and nomenclature.

  > inverse computation would take an intractable amount of time.

On a real side note I really like this explanation of P vs NP as it explicitly talks about reversibility. https://m.youtube.com/watch?v=6OPsH8PK7xM