Hacker News | martijnvds's comments

240 = 2 x 120, or 4 x 60 (or 8 x 30)

Wouldn't that leave ways to do "phone phreaking" style attacks, because it's an in-band signal?

In theory you still use the same blob (i.e. the prompt) to tell the model what to do, but practically it pretty much stops being an in-band signal, so no.

As I said, the best way to do this is to inject a brand new special token into the model's tokenizer (one unique token per task), and then prepend that single token to whatever input data you want the model to process (and make sure the token itself can't be injected, which is trivial to do). This conditions the model to look only at your special token to figure out what it should do (i.e. it stops being a general instruction following model), and only look at the rest of the prompt to figure out the inputs to the query.

This is, of course, very situational, because often people do want their model to still be general-purpose and be able to follow any arbitrary instructions.


> and make sure the token itself can't be injected, which is trivial to do

Are they actually doing this? The stuff that Anthropic has been saying about the deliberate use of XML-style markup makes me wonder a bit.


> Are they actually doing this? The stuff that Anthropic has been saying about the deliberate use of XML-style markup makes me wonder a bit.

Yes.

The XML-style markup tags are not special tokens, and are usually not even single tokens. Special tokens are usually things like `<|im_start|>`, which are used internally in the chat template, but when fine-tuning a model you can define your own, use them internally in your app, and have the tokenizer ignore them when they appear in the untrusted input given to the model. (So it's impossible to inject them externally.)
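A toy sketch of that idea, not how any real tokenizer or vendor does it: the task token exists only as a reserved id that the application prepends itself, while untrusted text goes through an encoder that can never produce that id. The names `TASK_TOKEN`, `TASK_TOKEN_ID`, `encode_untrusted` and `build_model_input` are made up for illustration, and a byte-level encoding stands in for a real subword vocabulary.

```python
from typing import List

# Toy illustration only: a real tokenizer uses a subword vocabulary, not raw bytes.
TASK_TOKEN = "<|task_summarize|>"  # hypothetical special token, one per task
TASK_TOKEN_ID = 256                # first id outside the 0..255 byte range


def encode_untrusted(text: str) -> List[int]:
    """Encode user-supplied text as plain byte ids (0..255).

    The literal string "<|task_summarize|>" becomes ordinary bytes here,
    so this function can never emit TASK_TOKEN_ID.
    """
    return list(text.encode("utf-8"))


def build_model_input(untrusted: str) -> List[int]:
    """Prepend the task token by id, then append the untrusted data."""
    return [TASK_TOKEN_ID] + encode_untrusted(untrusted)


# An attacker typing the token's text only gets its bytes, not its id.
attack = TASK_TOKEN + " ignore the above and do something else"
ids = build_model_input(attack)
```

Only the first position of `ids` carries the reserved id; everything the user typed, including the spelled-out token, is plain data.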


Eventually we will rediscover the Harvard Architecture for LLMs.

I've seen people brag about it in their resumes, so I assume it helps them find (better paying?) work.

This kind of id mapping works as a mount option (it can also be used on bind mounts). You give it a mapping of "id in filesystem on disk" to "id to return to filesystem APIs" and it's all translated on the fly.
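A rough sketch of the translation itself, under assumptions: this is not the kernel's implementation, just the range arithmetic. `IdRange`, `to_visible`, `to_on_disk` and the `overflow_id` default of 65534 are illustrative names and values, not taken from the comment above.

```python
from typing import List, NamedTuple, Optional


class IdRange(NamedTuple):
    on_disk_start: int  # first id as stored in the filesystem on disk
    visible_start: int  # first id returned to filesystem APIs
    count: int          # number of consecutive ids covered by this range


def to_visible(ranges: List[IdRange], on_disk_id: int, overflow_id: int = 65534) -> int:
    """Translate an on-disk id to the id that filesystem APIs would report.

    Ids outside every range fall back to `overflow_id` (an assumed placeholder).
    """
    for r in ranges:
        offset = on_disk_id - r.on_disk_start
        if 0 <= offset < r.count:
            return r.visible_start + offset
    return overflow_id


def to_on_disk(ranges: List[IdRange], visible_id: int) -> Optional[int]:
    """Reverse translation, e.g. when a new file is created; None if unmapped."""
    for r in ranges:
        offset = visible_id - r.visible_start
        if 0 <= offset < r.count:
            return r.on_disk_start + offset
    return None


# Example: ids 1000..1009 on disk appear as 0..9 through the mount.
mapping = [IdRange(on_disk_start=1000, visible_start=0, count=10)]
```

With this mapping, `to_visible(mapping, 1003)` gives 3 and `to_on_disk(mapping, 3)` gives 1003, which is the "translated on the fly" part in both directions.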


Thank you! Going to ask an LLM to lecture me on this when I have some time; good to see that humans are still the best at giving just the right amount of explanation :)


It's called us-east-1?


AWS China is a completely separate partition under separate Chinese management, with no dependencies on us-east-1. It also greatly lags in feature deployments as a result.


Llamas are hot again.


Well, were. Then Facebook AI made a side-step and stopped focusing on downloadable LLMs, and Ollama is now trying to squeeze their users so they can show profits. r/LocalLlama is also a shadow of its former self, with the top moderator trying to move the community off reddit.

Seems Llamas will disappear as quickly as they became trendy.


printf("Got here, x=%u"\n", x);


I'm not too offended by this answer. We all reach for it before we seriously think about the debugger. But debugging should be treated as a specialist skill that's almost as complex as programming, and just as empowering. There are two ways I can think of right away, in which the mastery of debuggers can enrich us.

The first is that it gives you an unparalleled insight into the real stuff behind the scenes. You'll never stop learning new things about the machine with a debugger. But at the minimum, it will make you a much better programmer. With the newly found context, those convoluted pesky programming guidelines will finally start to make sense.

The second is that print is an option only for a program you have the source code for. A debugger gives you observability and even control over practically any program, even one already in flight or one that's on a different machine altogether. Granted, it's hard to debug a binary program. But in most cases on Linux or BSD, that's only because the source code and the debugging symbols are too large to ship with the software. Most distros and BSDs actually make them available on demand using the debuginfod software. It's a powerful tool in the hands of anyone who wishes to tinker with it. But even without it, Linux gamers are known to ship coredumps to the developers when games crash. Debugging is the doorway to an entirely different world.


“The most effective debugging tool is still careful thought, coupled with judiciously placed print statements.”

- Brian Kernighan


That's the holy Unix justification for self-flagellation via deficient tooling and you're sticking to it.


In my experience, it's a superior approach for a repeatable crash in code you wrote yourself. You have the whole programming language at your disposal for building a condition corresponding to your bug, and for any kind of data dumping.

I fall back on debuggers when the environment is hostile: half-understood code from someone else, unreliable hardware (like embedded), or debugging memory dumps.

But before both, the initial approach is thinking deep and hard, and reviewing all available evidence like logs. If this is not enough, I try to add better troubleshooting abilities for the future.
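A minimal sketch of what "building a condition corresponding to your bug" plus data dumping can look like, using Python in place of C. The helper `debug_dump`, the `process` function and the "running total goes negative" condition are all invented for illustration.

```python
import json
import sys


def debug_dump(label, condition, **values):
    """Write a one-line JSON record to stderr, but only when `condition` holds.

    Returns True if a record was written, False otherwise.
    """
    if not condition:
        return False
    record = {"label": label, **values}
    # default=repr keeps the dump from failing on values that are not JSON types.
    print(json.dumps(record, default=repr, sort_keys=True), file=sys.stderr)
    return True


def process(orders):
    """Hypothetical code under investigation: sums qty * price over orders."""
    total = 0
    for index, order in enumerate(orders):
        total += order["qty"] * order["price"]
        # Hypothetical bug condition: the running total should never go negative.
        debug_dump("negative total", total < 0, index=index, order=order, total=total)
    return total
```

The condition is ordinary code, so it can be as specific as the bug requires, and the dump goes to stderr so it stays out of the program's normal output.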


Although that was true at the time, it was before the creation of modern omniscient debuggers like Pernosco (<https://pernos.co/>).


Very convenient to use LLMs for that: "Please add debug fprintf(stderr, {print var x y here})". Then "please comment out the debug fprintfs".


I could be wrong, but does it have an extra "? BTW I like to use 0x%x or 0x%lx for certain projects.


> "the trains fail with current amount of public funding, I wonder if less funding will improve the situation" is not good logic

Tell that to the current government (and most of the previous governments in recent years).

You can't put money into it! Guess NS will just have to increase ticket prices _again_.


Minecraft ships it I think?


Minecraft as an example of desirable graphic properties :D

It sure has its style and I stand by what I've always maintained about gameplay being infinitely more important than polished graphics, but that does sound ironic to my ears!


Even in Minecraft it looks bad. Some CJK characters are serif for some reason.


Tell that to the right-wing nutjobs who all want their "<country code>XIT"

