While I agree with you that it's a thought-terminating cliché, I would caution that humans have historically been very bad at knowing which specific traits are good vs. bad, while also being very strong on enforcing ingroup/outgroup and purity dynamics.
We could all be hyper-muscular (from that Myostatin gene) and have tetra-chromatic vision*, but that leads to the joke about how "in the future there will be three genders: kpop, furry, and tank", where kpop represents normative beauty standards, furry represents self-expression, and tank represents hyper-optimising for niche goals like being strong.
On the more near-term impacts, before we're ready for me to get turned into an anthro-wolf, if we all end up with our genomes subject to regular updates like our software currently is, some of us are inevitably going to face our cells getting bricked while we're still made of them.
Given the nature of large organisations, and that they are obviously important vectors for anyone who desires to attack whoever they please, I assume that all the Big Tech companies have not only US government agents embedded inside them, but also non-US government agents and criminal actors, and that all of them will be attempting to exploit whatever they can, including Mythos (and everything else), for gain.
However, the rate of change we're currently seeing suggests that before 2030 everyone will be able to access this kind of power on a local model hosted on their own phone, which means it goes from "the NSA are using it to hack BYD, while China is using it to hack Premier Election Solutions' voting machines" to "local drug dealer hacks all the local surveillance cameras[0] they walk past and replaces the footage with a deepfake of someone else so they can't get recognised in it".
Ideally, things like Mythos close the security vulnerabilities. I am deeply cynical that this will be successful, because of all the times a boss has demanded special privileges to the detriment of security (famously both Hillary Clinton and Donald Trump, the latter in multiple different ways).
[0] Before anyone says "surely the CCTV cameras would be secure": (1) before LLMs happened, people were saying "surely we'd never let the AI out of the box and onto the internet"[1]; (2) see all recent news about Flock.
National security risk from the cameras would be a good point, except for them then saying:
Farley spoke with White House officials about the issue, arguing that Chinese companies should be required to form joint ventures and hand controlling stakes to US automakers in order to sell their vehicles here, according to a Bloomberg report.
The "huge direct support" he claims Chinese automakers receive from their government would be a good point, except for how Tesla infamously got $15-16bn of their $45-ish bn lifetime net income from the government both directly from subsidies and indirectly from other businesses via regulatory credits.
Jobs would be a great point, but I find it hard to believe they actually care about jobs per se and would be very eager to replace all the workers with robotics (presumably mostly of the fixed arm variety rather than anything of the androids in the news for the last few years, despite the sales pitches the latter get).
> LLM output doesn't have the variety of human output, since they operate in fixed fashion - statistical inference followed by formulaic sampling.
This is the wrong thing to look at; your chess analogy is much stronger, and the detection method is similar (if you can figure out a prompt that generates something close to the content, it almost certainly isn't of human origin).
But as to why the thing I'm quoting doesn't work: if you took, say, web comic author Darren Gav Bleuel, put him in a sci-fi mass duplication incident to make 950 million of him, and had them all talking and writing all over the internet, people would very quickly learn to recognise the style, which would have very little variety because they'd all be forks of the same person.
Indeed, LLMs are very good at presenting other styles than their defaults, better at this than most humans, and what gives away LLMs is that (1) very few people bother to ask them to act other than their defaults, and (2) all the different models, being trained in similar ways on similar data with similar architectures, are inherently similar to each other.
An LLM is just a computer function that predicts the next word based on the input you give it. It doesn't make any difference what the input is (e.g. "please respond in style X") - the function doesn't change, and the statistical signature of how it works will still be there.
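The "statistical inference followed by formulaic sampling" point can be sketched in a few lines. This is a toy illustration with made-up logits and vocabulary, not any real model's sampler; the key observation is that the function itself never changes, only its input does:

```python
import math
import random

def softmax(logits):
    # Statistical inference step: turn raw scores into a probability distribution.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, temperature=0.8, seed=0):
    # Formulaic sampling step: the exact same procedure runs no matter what
    # the prompt was -- "respond in style X" only changes the logits fed in,
    # not this function, so its statistical fingerprint is always present.
    rng = random.Random(seed)
    probs = softmax([x / temperature for x in logits])
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

Different "prompts" just mean different logits; the sampling recipe is identical every time.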
If you don't believe me, try it for yourself. Ask an AI to generate some text and give it to the AI detector below (paste your text, then click on scan). Now ask the AI to generate in a different style and see if it causes the detector to fail.
I can't use that linked app; it paywalls immediately. Unlike the person you were replying to here[0], I do not claim that this is impossible:
An LLM is indeed just a computer function that does stats. And our brains are just electro-chemistry that does stats. This is why stylometric analysis of human writing is a thing.
My previous experience with tools like the one you've linked to is that they used to be quite poor. I assume they're better since then, but then again so are the models.
> I assume they're better since then, but then again so are the models.
Yes, but "better" means different things for each of these.
Detectors are trying to get better at distinguishing human from LLM-generated text.
LLMs are being improved to generate more useful (and benchmark maxxing) outputs, not to attempt to avoid detection.
LLMs are in fact explicitly trained to be as predictable as possible. The training goal is to minimize continuation prediction errors, which means they are in effect being trained to generate output where each word can be predicted by what came before it (which we can contrast to a human who tries to spice it up and keep it interesting by not being too predictable!).
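The training objective described above, reduced to a toy sketch (the probabilities here are invented for illustration): cross-entropy over next-token predictions is minimised exactly when each token is highly predictable from its prefix.

```python
import math

def next_token_loss(predicted_probs, target_ids):
    # Average cross-entropy over next-token predictions: lowest when the
    # model assigns high probability to exactly the token that actually
    # came next, i.e. when the text is maximally predictable.
    total = 0.0
    for probs, target in zip(predicted_probs, target_ids):
        total += -math.log(probs[target])
    return total / len(target_ids)

# A "predictable" continuation (prob 0.9 on the true next token each step)
# scores far better than a "surprising" one (prob 0.1 on it):
predictable = next_token_loss([[0.9, 0.05, 0.05]] * 3, [0, 0, 0])
surprising = next_token_loss([[0.1, 0.45, 0.45]] * 3, [0, 0, 0])
```

A human writer deliberately "spicing it up" is, in these terms, choosing continuations the model would assign low probability to, which is precisely what this loss penalises.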
RL post-training, which is especially used for computer code and math, is going to change this word-by-word predictability (detectability) a bit since the focus is now on a longer term goal rather than next word, but to some extent you could also view it as just steering/narrowing the output of the model towards that goal, not totally overriding the next-word statistics.
I don't know if there are AI detectors specifically trained to detect AI code rather than prose, but I'd expect that is more difficult to do, both because of the RL factor, and because computer code is so predictable in the first place - adhering to rigid syntax etc.
We've been able to do that since at least 1897. It just wasn't important enough to bother with for a lot of the intervening time. It may still not be, we'll find out by people doing it or not doing it.
I've been using 5.4 recently, and even on "extra high" some of the tests it wrote were opening the source code and doing a regex to confirm the presence (or in some cases the absence) of specific substrings. It wasn't running the code to confirm behaviour, and the regexes didn't even do a basic check to confirm the text wasn't commented out (not that it would've been sufficient if they had, this is just to illustrate how bad it was).
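A minimal sketch of that failure mode (the source text and regex here are invented for illustration): a "test" that greps the source for a substring instead of exercising the code passes even when the only match is inside a comment.

```python
import re

# Invented example source under test: the setting only survives in a
# comment, and the behaviour it's supposed to guarantee doesn't exist.
source = """
# max_retries = 3   <- old setting, now removed
def fetch(url):
    return http_get(url)  # no retry logic at all
"""

# The regex-style "test": confirms the presence of a substring without
# running anything, and without checking whether the line is commented out.
regex_says_present = bool(re.search(r"max_retries\s*=\s*3", source))
```

A behavioural test would instead stub the network call to fail twice and assert that a third attempt is actually made.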
So, yeah. I'd guesstimate this model was fine 75% of the time, mediocre 15-20%, and actively bad 5-10% of the time. How valuable it is depends on how much energy you can spare as a human on spotting the bad.
Sure, but also the EU is comparably as weak over its member states as the US Federal government was over American states in the Articles of Confederation era. This is how Hungary was able to paralyse the collective response against Russia.
Nevertheless, extraditions based on international mandates are usually respected (terms and conditions may apply, see Greece or Italy). Wanted people often go to Serbia nowadays, to give a successful example.
Indeed. But I did write "will find friendly arms in China and Europe", and Greece, Italy, and indeed Serbia, are in Europe.
The whole continent != nation thing is clearer with the EU != Europe (due to the EU not even being a nation yet) than with the American nation != The Americas.
Even then, don't underestimate rules-lawyering of laws: I wish to suggest that the USA is going down the path of "rogue state", and that extradition treaties may have clauses (either explicitly in treaty text* or implicitly via the European Convention on Human Rights) protecting individuals from the risk of a death penalty, which may end up getting invoked due to the US having the death penalty.
Article 13 ("Capital punishment") provides that when an offense for which extradition is sought is punishable by death under the laws in the requesting State but not under the laws in the requested State, the requested State may grant extradition on condition that the death penalty shall not be imposed or, if for procedural reasons such condition cannot be complied with by the requesting State, on condition that if imposed the death penalty shall not be carried out.
If there's a loss of trust that the US will honour its obligations, and in other cases besides extradition this has already happened, what then?
I increasingly avoid installing apps when I have an alternative, but it wasn't always so. Somehow I've managed to be an iOS app developer since the first retina iPod touch.
> Particularly machine translations are no worse than what an untrained native speaker would come up with, and much better than traditional translators
Sometimes. I use Google Translate (literally the same architecture, last I heard), and when it works, great. Every single time I've tried demonstrating that it can't do Chinese by quoting the output it gives me from English-to-Chinese, someone replies to tell me that the translated text is gibberish*.
Even with an easier pair, English <-> German, sometimes I get duplicate paragraphs. And there are definitely still cases where even the context comprehension fails, as you should be able to see by going to a random German website, e.g. https://www.bahn.de/, in e.g. Chrome and translating it into English: note the out-of-place words, like how the destination is "goal" and the tickets are "1st grade" and "2nd grade" instead of class.
* I'm curious if this is still true, so let's see:
I'm not sure if we're on the same page. I mean LLMs, right? Not whatever Google Translate and DeepL use. The latter was better than gtrans when it launched; nowadays it's probably similar, idk. Both are clearly machine learning, but the products (and their quality) predate LLMs. They're not LLMs, and they haven't noticeably improved since LLMs. Asking an LLM produces better output (so long as the LLM doesn't get sidetracked by the text's contents), presumably at orders of magnitude higher energy consumption per word, even if you ignore training.
I agree that Google Translate, now on par with DeepL's free product afaik (but I'm not a gtrans user so I don't know), is decent but not a full replacement for humans, and that LLMs aren't as good as human translations either (not just for attention reasons), but it's another big step forwards right?
I'm not sure what DeepL uses, but Google invented the Transformer architecture, the T in GPT, for Google Translate.
IIRC, the original difference between them was about the attention mask, which is akin to how the Mandelbrot and Julia fractals are the same formula but the variables mean different things; so I'd argue they're basically still the same thing, and you can model what an LLM does as translating a prompt into a response.
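The Mandelbrot/Julia point, as a minimal sketch: both sets come from iterating the same formula, z → z² + c, and differ only in which variable is the input being tested and which is held fixed.

```python
def escapes(z, c, max_iter=50):
    # The shared iteration: z -> z*z + c, declared "escaped" once |z| > 2.
    for _ in range(max_iter):
        if abs(z) > 2:
            return True
        z = z * z + c
    return False

def in_mandelbrot(c):
    # Mandelbrot: vary c, always start z at 0.
    return not escapes(0j, c)

def in_julia(z, c=-0.8 + 0.156j):
    # Julia: vary the starting z, hold c fixed (this particular c is just
    # a commonly plotted example constant).
    return not escapes(z, c)
```

Same formula, different choice of free variable, visibly different fractals - the analogy being that a translation model and an LLM share the architecture while assigning different roles to its inputs.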
* I don't know how that works so here's the wikipedia page: https://en.wikipedia.org/wiki/Tetrachromacy#Humans