Hacker News | twosdai's comments

I think the sentiment here is about management tying bonuses to near-term stock performance, maybe not about the market itself. I agree with your view that investors mostly want long-term gains over short-term fluctuations.

Like the em dash, whenever I read "this isn't x, it's y" my dumb monkey brain goes "THAT'S AI," regardless of whether it's true or not.


For me it's the "why this matters", "why this works", etc


Ugh - yes. I’m seriously close to writing a chrome extension just to warn me or block pages that have that phrase…it’s irrational because there are so many legitimate uses, but they are dead to me.


I don't know, man. I feel emboldened to keep using the em dash exactly because I want to protest against people equating it with "AI reply," even though there are very legitimate uses for it.


Another common tell nowadays is the apostrophe type (’ vs ').

I don't know personally how to even type ’ on my keyboard. According to find in chrome, they are both considered the same character, which is interesting.

I suspect some word processors default to one or the other, but it's becoming all too common in places like Reddit and emails.


If you work with macOS or iOS users, you won’t be super surprised to see lots of “curly quotes”. They’re part of base macOS, no extra software required (I cannot remember if they need to be switched on or they’re on by default), and of course mass-market software like Word will create “smart” quotes on Mac and Windows.

I ended up implementing smart quotes on an internal blogging platform because I couldn’t bear "straight quotes". It’s just a few lines of code and makes my inner typography nerd twitch less.
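For anyone curious, a few lines really does cover the basic case. Here's a rough sketch in Python (my own toy version, not the platform's actual code): a quote at the start of the text or after whitespace or an opening bracket becomes an opening curly quote; everything else becomes a closing one.

```python
import re

def smart_quotes(text: str) -> str:
    """Naive straight-to-curly quote substitution."""
    # A double quote at the start of the string, or after whitespace
    # or an opening bracket, opens; every remaining one closes.
    text = re.sub(r'(^|(?<=[\s(\[{]))"', "\u201c", text)  # “
    text = text.replace('"', "\u201d")                    # ”
    # Same rule for single quotes; apostrophes fall through to ’
    text = re.sub(r"(^|(?<=[\s(\[{]))'", "\u2018", text)  # ‘
    text = text.replace("'", "\u2019")                    # ’
    return text
```

It mishandles plenty of edge cases (inch marks, nested quotes, quotes after dashes), which is exactly why word processors ship much longer versions of the same idea.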


Word (you know, the most popular word processor out there) will do that substitution. And on macOS & iOS, it's baked into the standard text input widgets so it'll do that basically everywhere that is a rich text editor.


> According to find in chrome, they are both considered the same character, which is interesting.

Browsers do a form of normalization in search. It's really useful, since it means "resume" will match résumé, unless of course you disable it (in Firefox, this is the "Match Diacritics" checkbox). (Also: itʼs, it's; if you want to see it in action on those two words.)
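You can approximate what the browser is doing with Unicode normalization plus a little hand-folding. A sketch (the browsers' real folding tables are more thorough than this):

```python
import unicodedata

def search_fold(s: str) -> str:
    """Rough approximation of find-in-page matching: decompose,
    drop combining marks (diacritics), map apostrophe look-alikes
    onto the ASCII quote, and fold case."""
    s = unicodedata.normalize("NFKD", s)
    s = "".join(ch for ch in s if not unicodedata.combining(ch))
    # U+2019 and U+02BC have no Unicode decomposition, so fold by hand
    for apos in ("\u2019", "\u02bc", "\u2018"):
        s = s.replace(apos, "'")
    return s.casefold()
```

With that folding, "resume" matches "résumé", and itʼs / it's / it’s all collapse to the same string, which is roughly the behavior described above (and what the "Match Diacritics" checkbox turns off).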


I’ve been using em-dashes since high school — publishing the school paper and everything. I remain slightly bemused by people discovering em-dashes for the first time thanks to LLMs.

Also, “em-dashes are something only LLMs use” comes perilously close to “huh, proper grammar, must’ve run this by a grammar checker”.


I started using them when I discovered the compose key and it became easy to type them, but I've genuinely considered stopping for this reason.


The problem with this is that people are adapting their REAL SPEECH to the pattern, so they're actually saying this in real conversations.

(we do this all the time; eg. a new popular saying lands in an episode of a tv show, and then other people start adopting it, even subconsciously)


it's the <<<<gold-standard>>>> for spotting LLMs in the wild

(that's what Gemini would say)


I can confirm Ryan is a real human :)


Is there a chance you could ask Ryan whether he had an LLM write or rewrite large parts of this blog post? I don't mind either way; it's a good and informative post. But I strongly assumed as much while reading it, and if it's truly not LLM writing, that would be a super useful indicator of how often I'm wrongly making that assumption.


There are multiple signs of LLM-speak:

> Over the past year, we’ve seen a shift in what Deno Deploy customers are building: platforms where users generate code with LLMs and that code runs immediately without review

This isn't a canonical use of a colon (and the dependent clause isn't even grammatical)!

> This isn’t the traditional “run untrusted plugins” problem. It’s deeper: LLM-generated code, calling external APIs with real credentials, without human review.

Another colon-offset dependent paired with the classic, "This isn't X. It's Y," that we've all grown to recognize.

> Sandboxing the compute isn’t enough. You need to control network egress and protect secrets from exfiltration.

More of the latter—this sort of thing was quite rare outside of a specific rhetorical goal of getting your reader excited about what's to come. LLMs (mis)use it everywhere.

> Deno Sandbox provides both. And when the code is ready, you can deploy it directly to Deno Deploy without rebuilding.

Good writers vary sentence length, but it's also a rhetorical strategy that LLMs use indiscriminately with no dramatic goal or tension to relieve.

'And' at the beginning of sentences is another LLM-tell.


> It’s deeper: LLM-generated code, calling external APIs with real credentials, without human review.

This also follows the rule of threes, which LLMs love. There ya go.


Yeah, I feel like this is really the smoking gun. Because it's not actually deeper? An LLM running untrusted code is not some additional level of security violation above a plugin running untrusted code. I feel like the most annoying part of "It's not X, it's Y" is that agents often say "It's not X, it's (slightly rephrased X)", lol, but it takes like 30 seconds to work that out.


It's not just a different way of saying something, it's a whole new way to express an idea.


Could it be that after reading so many LLM texts we will just subconsciously follow the style, because that's what we're used to? No idea how this works for native English speakers, but I know that I lack my own writing style; it's just a pseudo-LLM mix of Reddit/IRC/technical documentation, as those were the places where I learned written English.


Yes, I think you're right—I have a hard time imagining how we avoid such an outcome. If it matters to you, my suggestion is to read as widely as you're able to. That way you can at least recognize which constructions are more/less associated with an LLM.

When I was first working toward this, I found the LA Review of Books and the London Review of Books to be helpful examples of longform, erudite writing. (edit - also recommend the old standards of The New Yorker and The Atlantic; I just wanted to highlight options with free articles).

I also recommend reading George Orwell's essay Politics and the English Language.


Given that a lot of us actively try to avoid this style, and immediately disregard text that uses it as not worth reading (a very useful heuristic given the vast amount of LLM-generated garbage), I don't think that would make us more prone to write in this manner. In fact I've actively caught myself editing text I've written to avoid certain LLMisms.


It's unfortunate that, given the entire corpus of human writing, LLMs have seemingly been fine-tuned to reproduce terrible ad copy from old editions of National Geographic.

(Yes, I split the infinitive there, but I hate that rule.)


Great list. Another tell is pervasive use of second-person perspective: “We’ve all been there.” “Now you have what you need.”

As you say, this is cargo cult rhetorical style. No purpose other than to look purposeful.


As someone who has a habit of overusing em dashes, often to my detriment, and who tries to be mindful of it in general: this whole thing of assuming they mean AI-generated text is a huge blow. It feels like a personal attack.


"—" has always seemed like a particularly weak/unreliable signal to me, if it makes you feel any better. Triply so in any content where one would expect smart quotes or formatted lists, but even in general.

RIP anyone who had a penchant for "not just x, but y" though. It's not even a go-to wording for me and I feel the need to rewrite it any time I type it out of fear it'll sound like LLMs.


> RIP anyone who had a penchant for "not just x, but y" though

I felt that. They didn’t just kidnap my boy; they massacred him.


It’s about more than the emdash. The LLM writing falls into very specific repeated patterns that become extremely obvious tells. The first few paragraphs of this blog post could be used in a textbook as it exhibits most of them at once.


Couldn't agree more. It's frankly very fatiguing.


P

We I 787 I 879-0215 I I I ui 87⁸⁸78⁸877777777 I 77 I⁸7 I 87888887788 I 7788 I I 8 I 8 I 788 I 7⁷88 I 8⁸I 7788 I 787888877788888787 7pm I 87 I⁸77 I ui 77887 I 87787 I 7777888787788787887787877777⁷777⁷879-0215 7777 I 7pm⁷I⁷879-0215 777⁷IIRC 7 7pm 87787777877 I I I⁷⁷7 ui ui 7⁷879-0215 I IIRC 77 ui 777 I 77777 I7777 ui I 7877777778 I7 I 77887 I 87⁷8777⁸8⁷⁷⁸⁸7⁸⁸⁸87⁸⁸⁸⁸8⁷87⁸⁸87888⁷878⁷878887⁸⁸⁸88⁸878888888888888888888887878778788888888787788888888888888888888888888887 ui is 888888888887 7


did you have a stroke?


Wow nice work. Thanks for doing this and writing it up.


One of the things I find interesting is that many of my friends outside the Western world typically see "knowing how something is made" as a Western cultural thing. Many of them respond with a "why do you care how it's made, you are not a manufacturer" kind of attitude, which I find very interesting.

They still care about the quality of the product, just not the process as much. Not sure if this is the case for all people or just a generalization; just something I noticed.


Didn't actually check out the app, but some aspects of application state are hard to serialize, and some operations are not reversible by the application, e.g. sending an email. It doesn't seem trivially achievable for all apps.

So maybe for some apps, but "all" is a difficult claim.


For irreversible stuff I like feeding messages into queues. That keeps the semantics clear and makes the bounds of reversibility explicit.
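A minimal sketch of that pattern (EmailOutbox and the transport callable are made-up names, not from any particular framework): irreversible effects get buffered, and only commit() crosses the point of no return.

```python
import queue

class EmailOutbox:
    """Buffer irreversible side effects instead of performing them inline."""

    def __init__(self, transport):
        self._pending = queue.SimpleQueue()
        self._transport = transport  # the thing that actually sends

    def send(self, message):
        self._pending.put(message)  # still reversible: only enqueued

    def rollback(self):
        while not self._pending.empty():
            self._pending.get()     # drop messages that never went out

    def commit(self):
        while not self._pending.empty():
            self._transport(self._pending.get())  # irreversible from here on
```

Everything before commit() can be undone by dropping the queue, which is what makes the reversibility boundary explicit.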


Tool calls are the boundary (or at least one of them).


> no children were sexually assaulted

Generating pictures of a real child naked is assault. Imagine finding naked childhood photos of yourself being passed around online. It's extremely unpleasant, and it's assault.

If you're arguing that generating a "fake child" is somehow significantly different, and you want to split hairs over the CSAM/CP term in that specific case: it's not a great take, to be honest. People understand CSAM; actually verifying whether it's a "real" child or not is not really relevant.


> actually verifying whether it's a "real" child or not is not really relevant

It's entirely relevant. Is the law protecting victims or banning depictions?

If you try to do the latter, you'll run head first into the decades long debate that is the obscenity test in the US. The former, meanwhile, is made as a way to make sure people aren't hurt. It's not too dissimilar to freedom of speech vs slander.


> Is the law protecting victims or banning depictions?

Both. When there's plausible deniability, it slows down all investigations.

> If you try to do the latter, you'll run head first into the decades long debate that is the obscenity test in the US. The former, meanwhile, is made as a way to make sure people aren't hurt. It's not too dissimilar to freedom of speech vs slander.

There's a world outside the US, a world of various nations which don't care about US legal rulings, and which are various degrees of willing-to-happy to ban US services.


>There's a world outside the US

Cool, I'm all for everyone else banning X. But sadly it's a US company subject to US laws.

I'm just explaining why anyone in the US who would take legal action may have trouble without making the above distinction.

Definitely a core weakness of the Constitution. One that assumed a lot of good faith in its people.


It, the difference between calling child pornographic content CP vs CSAM, is splitting hairs. Call it CSAM; it's the modern term. Don't try to create a divide on terminology because of an edge case in how some legal codes are interpreted. It doesn't really help, in my opinion, and isn't a worthwhile argument. I understand where you're coming from on a technicality, but the current definition fits well enough, so why make it an issue? As an example, consider the following theoretical case:

A lawyer and a judge are discussing a case, using the term CSAM, and need to argue about whether it matters legally that the child is real or not. What help is it in that moment to say CP instead of CSAM? I don't think it changes anything: in both cases the lawyer and judge would still need to clarify for everyone that the person is "presumably" not real. So an acronym change on this point is still not a great take to me. It's regressive, not progressive.


> It, the difference between calling child pornographic content CP vs CSAM, is splitting hairs.

Yes, and it's a lawyer's job to split hairs. Upthread was talking about legal action, so being able to distinguish the terms changes how you'd attack the issue.

> What help is it in this situation to use CP vs CSAM in that moment. I dont really think it changes things at all.

I just explained it.

You're free to have your own colloquial opinion on the matter. But if you want to discuss law, you need to understand the history of the topic, especially one as controversial as this. These are probably all tired talking points from before we were born, so while it may seem novel or insignificant to us, it's language that has made or broken cases in the past, cases that will be used as precedent.

> So an acronym change on this point is still not a great take to me. It's regressive, not progressive.

I don't really care about the acronym. I'm not a lawyer. A duck is a duck to me.

I'm just explaining why in this legal context the wording does matter. Maybe it shouldn't, but that's not my call.


It's also irrelevant to some extent: manipulating someone's likeness without their consent is antisocial, illegal in many jurisdictions, and doing so in a sexualized way makes it even more illegal.

The children aspect just makes a bad thing even worse, and it thankfully seems to get some (though not enough, IMO) people to realize it.


> but it definitely has some bias.

To be frank, though, I think this is a better way than all people's thoughts all of the time.

I think the "crowd" of information makes the end output of an LLM worse rather than better, specifically in our inability to know what kind of bias we're dealing with.

Currently it feels really muddy to me knowing how information is biased, beyond just the hallucinations and factual inconsistencies.

But as far as I can tell, correctness of the content aside, sometimes frontier LLMs respond like freshman college students, other times with the rigor of a mathematics PhD candidate, and sometimes like a marketing hit piece.

This dataset has a consistency which I think is actually a really useful feature. I agree that having many perspectives in the dataset is good, but as an end user being able to rely on some level of consistency with an AI model is something I really think is missing.

Maybe more succinctly: I want frontier LLMs to have a known and specific response style and bias which I can rely on, because there is already a lot of noise.


As an outsider to big cos, I always felt that if you're not on one of the 10-20 awesome product teams (e.g. Google Maps, AWS Lambda, Windows core OS, something along those lines), it seems like territory for justification Olympics.

Just my view as a dev whose largest co was like 500 people, ~100 engineers.


Couldn't be more on the nose.

Big companies are significantly better to work in when you're either (a) in sales with a clear path to hitting/exceeding quota, (b) a strategic revenue generator, or (c) a super hot and extremely well funded corporate initiative (basically all AI projects right now).

The money tap is always on, you get all the cool toys, travel perks are great, and you get to work on amazing stuff without as much red tape.


Yeah, I was working on more of an infra thing (involving caching and indexing). Certainly important given the size of the company, but not something that gets lots of hype or sexiness.

There were occasional bits of ambition to work on interesting stuff, but it was mostly "keep the lights on and then figure out how to make yourself seem important".

One of my biggest pet peeves is when engineers say that we can’t do something because we would have to learn something new. I got into several arguments because I wanted to rewrite some buggy mutex-heavy code (that kept getting me paged in the middle of the night) with ZeroMQ, and people acted like learning it was some insurmountable challenge. My response would usually be something to the effect of “I’m sorry, I was under the impression that we were engineers, and that we had the ability to learn new things”.

As I said, complaints about my attitude weren’t completely unfounded, but it’s just immensely frustrating for people using their unwillingness to learn new things as an excuse to keep some code in a broken state.


You're complaining about resume-driven development in the same thread where you're upset they wouldn't let you rewrite everything in ZeroMQ? That's a very inconsistent position, reflects extreme confirmation bias, and by itself suggests you may need to look in the mirror.


I didn’t want to rewrite everything in ZeroMQ. I wanted to rewrite one 2000 line service with ZeroMQ because the service was already broken and I was the only person who was dealing with the consequences because I was the only person who got paged for that particular service.

Usually I advocated for doing things a more boring way, and I certainly don’t agree with making every damn thing an “initiative”, which was my biggest issue at BigCo.

I don’t think it’s inconsistent. I wanted to use the right tool for the right job. Usually I can get by with Java’s built-in tooling, and that was my initial attempt at a rewrite, but I ended up trying to re-invent a bunch of concurrency patterns with BlockingQueue, and I found that literally everything I was spending a lot of (my own free) time on was handled in like four lines of ZeroMQ.

I have a single line on my resume for ZeroMQ as a keyword, despite having used it in many, many projects, so it certainly wasn’t used explicitly to pad my resume.


If @tombert worked for me at BigCo, I'd give them a big raise for doing the exact right thing. This is Employee of the Year performance.

@tombert recognized that the homegrown tech was awful (*) and proposed a mature, reliable, well documented and supported, low-cost, utterly mainstream and mature replacement. That's not resume packing, that's pragmatic, rational software design.

@tombert also knows that every tech professional must routinely learn new things, otherwise they'll be unemployable dinosaurs long before retirement age. Tech dinosaurs aren't a pretty thing in the workplace.

(*) Especially awful because these are mutex and concurrency bugs, and @tombert knew that nondeterministic bugs cost expensive resources to investigate, find, and fix, simply because these bugs are unreproducible. Unlike straightforward deterministic bugs, concurrency bugs are open-ended tar pits that managers and engineers despise. These kind of bugs can eat up a project's schedule and energy.

edited: formatting bug. Fortunately it was reproducible!


I mean, obviously I agree with my own perspective :), but I do kind of understand the pushback to a certain extent.

Of course there are an effectively infinite number of potential routes you can go down with software, and of course you can't learn all of them, and you can't import every single helper library you'd like to.

We all like to think that the way we want to do things is objectively the best way, and I do think that there are objectively better ways of doing some things involving concurrency and the like, but a lot of the time it is subjective.

But as you said, I wasn't trying to import a library that was the latest hype on Hacker News; it's ZeroMQ. It's fast, well documented, easy to use, and very mature software with very good libraries in every major programming language, and it implements nearly every concurrency pattern that you'd want to use for most projects, and importantly it implements them correctly, which can be harder to do than it sounds.

As I said, I did have an attitude problem at that point in my career. I can blame it on a lot of stuff (untreated sleep apnea being a big one, as I later discovered), but I will admit I probably could have and should have been a bit more diplomatic in how I proposed these things.

I didn't really blame the person who wrote the code for it breaking (he had since left the company), because writing correct concurrent software is hard, I'm sure he had a reason at the time for doing it the way that he did, and of course all non-trivial software has bugs. What bothered me is that I had been designated as the sole person to deal with these issues, so I was the only one who had to deal with the consequences. The code hadn't been touched by anyone in years outside of adding basic NPE checks, so I felt people should let me try to fix it in a way I thought would be less error-prone; if it broke I'd be the one forced to fix it anyway, and I could feature-flag the hell out of it in case my code didn't work.


> it implements nearly every concurrency pattern that you'd want to use for most projects, and importantly it implements them correctly, which can be harder to do than it sounds.

This is key. Writing nontrivial and bug-free concurrent code is extremely hard, it's like writing absolutely solid crypto code. Both look easy, both are incredibly hard and anyone who doesn't know that, shouldn't be writing code at those layers.

Recommending a proven, off-the-shelf concurrency technology is the mark of an experienced and thoughtful software architect.


I think I found something even better: I'm just adjacent to the big money maker. We keep folks on the page a little longer but don't need to concern ourselves with revenue and ads. We just make it good enough that folks stick around, and it's important enough that we won't get axed.


> Big companies are significantly better to work in when you're either (...)

You're basically stating that people who are hired to staff projects that are superfluous secondary moonshots are more likely to be fired than those who maintain core business areas. That's stating the obvious. When a company goes through spending cuts, the first things to go are the money sinks and fluff projects that are not in any key roadmap. This is also why some companies structure their whole orgs around specific projects and even project features, because management limits the impact of getting rid of entire teams by framing that as killing projects or delays in roadmap.


If you're able to do some networking and tell people verbatim what you just wrote, I think that will help a lot more than grinding kernel logic or whatever.

If you're not already considering it, I would recommend trying to find a local hacker club like a DEF CON group or 2600 meeting. There are usually a few embedded folks who go or are affiliated with those.

