I think what that research found is that _auto-generated_ agent instructions made results slightly worse, but human-written ones made them slightly better, presumably because anything the model could auto-generate, it could also find out in-context.
But especially for conventions that would be difficult to pick up on in-context, these instruction files absolutely make sense. (Though it might be worth it to split them into multiple sub-files the model only reads when it needs that specific workflow.)
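One possible shape for that split (the file names here are purely hypothetical, just to illustrate the idea):

```
AGENTS.md                  # core conventions, always loaded
agent-docs/releases.md     # read only when cutting a release
agent-docs/migrations.md   # read only when touching the schema
```

The root file can then simply point at the sub-files, so the model only pays the context cost for a workflow when a task actually needs it.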
Well, you get the warning, but as long as HSTS is not active, you can still click on "Accept the risk and continue" …
[EDIT:] Just checked a bit closer: they are using a Let's Encrypt cert for "cuii.telefonica.de", which is obviously the wrong domain. But as I said above, as long as HSTS is not active for "annas-archive.li", you can still bypass the warning via the button.
If the censoring is at the DNS level, could the admin please replace the domain name in the URL with the IP address it should resolve to? Thank you.
Your country's broken internet is your problem. If your DNS queries are being censored, change the DNS resolver on your client side. If the queries are still intercepted, look into DoH.
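For the DNS-level case specifically, the simplest client-side fix is to pin the name locally so the censored resolver is never consulted. A minimal sketch (the IP address below is a documentation placeholder, not the site's real address — look that up from an uncensored network first):

```
# /etc/hosts (C:\Windows\System32\drivers\etc\hosts on Windows)
# 203.0.113.10 is a placeholder; substitute the real address.
203.0.113.10  annas-archive.li
```

For DoH, most browsers have a secure-DNS setting, and on the command line curl can test it directly via its `--doh-url` option, which bypasses the system resolver entirely.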
Likely. You can go into Nano Banana or ChatGPT right now, upload a pretty architectural rendering, and tell it to make it look old, weathered, wintry, etc., and it will come out looking very similar. Give it an example to really dial it in.
Note that this is the Flash variant, which is only 31B parameters in total.
And yet, in terms of coding performance (at least as measured by SWE-Bench Verified), it seems to be roughly on par with o3/GPT-5 mini, which would be pretty impressive if it translated to real-world usage, for something you can realistically run at home.
For me, these books are in the rare category of 'wait, I didn't know it was allowed to come up with a story _this good_'. I envy all those who have yet to read them for the first time.
Unfortunately, proving anything about a concrete imperative implementation is orders of magnitude more complex than working with an abstraction, because you have to deal with pesky 'reality' and truly cover every possible edge case, so it only makes sense for the most critical applications. And sometimes there isn't even a framework for doing that, depending on your use case, and you'd have to sit a PhD student down for a while to build one. And even then you're still working with an abstraction of some kind, since you have to assume a particular CPU architecture, etc.
It really is more difficult to work with 'concrete implementations' to a degree that's fairly unintuitive if you haven't seen it first-hand.
I can't fathom how crazy it gets to model once you try to consider compilers, architectures, timings, temperatures, bit flips and ECC, cache misses, pseudo- and "truly" random devices, threads, other processes, system load, I/O errors, and networking.
To me it seems mandatory to work with some abstraction underneath that allows factoring a lot of different cases into a smaller set of possibilities that needs to be analysed.
It's also how we manage to think at all in a world where tiny details give you a slightly (and likely insignificantly) different world-state to reason about.