> Anthropic’s Super Bowl debut, which drew close attention from rivals, illustrated the risks of that convergence. By centering its message on a promise not to include ads in Claude (a practice OpenAI announced it would begin testing earlier this year), the company bet that its ad-free principles would resonate with audiences more than product familiarity.
> That wager came amid a significant awareness gap. According to an S&P Global 2025 survey of 1,149 respondents, 73% reported using ChatGPT, followed by Google’s Gemini at 41%. Tools like Perplexity and Claude registered in the single digits, with only 7% of respondents saying they use Claude.
> “The vast majority of consumers don’t know what Claude is. Many people aren’t going to really get the whole ‘Oh, we’re comparing ourselves to chatGPT,’” said Williamson.
It's helpful to keep in mind that 'AI Twitter' is a bubble. Most people just don't have that many 'important' notes and calendar items.
People saying 'Claude is now managing my life!11' are like gearheads messing with their carburetor or (closer to this analogy) people who live out of Evernote or Roam
All that said, I've been thinking for a while that tool use and discrete data storage (documents, lists, etc.) will unlock a lot of potential in AI, over just having a chatbot manipulate tokens limited to a particular context window. But personal productivity is just one slice of such use cases.
It's interesting to think about how complex the Wikipedia markup is compared to something like GitHub-flavored Markdown or even standard HTML tables (although it eventually renders into standard HTML, so it's not more complex than the latter once you count all the other HTML elements in addition to <table>).
For example, the Swatch Internet Time infobox is dynamically updated:
{{short description|Alternate time system by watch maker Swatch}}
{{Infobox
| image = [[File:Swatch beat Logo.svg|200px|alt=Logo of Swatch Internet Time]]
| caption = Logo of Swatch Internet Time
| title = Time{{efn|at page generation }} {{purge|(update to view correct time)}}
| label1 = 24-hour time (UTC)
| data1 = {{nowrap|{{#time:H:i:s}}}}
| label2 = 24-hour time (CET)
| data2 = {{Time|CET|dst=no|df-cust=H:i:s|hide-refresh=yes}}
| label3 = .beat time (BMT)
| data3 = {{nowrap|@{{#expr: floor( {{#expr:{{#expr:{{#expr:{{#time:H|now + 1 hour}}*3600}}+{{#expr:{{#time:i}}*60}}+{{#time:s}}}}/86.4}} )}}}}
}}
I always found it ironic that the table syntax is designed to resemble ASCII-art-style tables, and then literally nobody writes it in a way that actually looks like an ASCII-art table.
Yeah, it’s not terribly hard to do that even without AI (Prettier can do it, for example). But there’s a lot of places where the tooling just isn’t available. Then again, it’s probably not a big deal if your GitHub comment markup isn’t perfect.
I think the root of the problem is, almost everything else you use in Markdown is easy to do by hand. There’s just no good syntax for tables like this, I guess.
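The kind of alignment Prettier does for Markdown tables can be sketched in a few lines; this is an illustrative toy, not Prettier's actual algorithm, and it assumes a well-formed GFM table with equal column counts.

```python
def align_md_table(text):
    """Pad each column of a GitHub-flavored Markdown table so the
    pipes line up, roughly what Prettier does on save."""
    rows = [[c.strip() for c in line.strip().strip('|').split('|')]
            for line in text.strip().splitlines()]
    widths = [max(len(r[i]) for r in rows) for i in range(len(rows[0]))]
    out = []
    for i, r in enumerate(rows):
        if i == 1:  # separator row: stretch the dashes to the column width
            cells = ['-' * w for w in widths]
        else:
            cells = [c.ljust(w) for c, w in zip(r, widths)]
        out.append('| ' + ' | '.join(cells) + ' |')
    return '\n'.join(out)
```

GitHub renders the padded and unpadded forms identically, which is exactly why nobody bothers to align by hand.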
I agree it's great, but that risk is so major that I stopped using it. "There's a 50% chance that your editor will invisibly corrupt the data you enter, and another 30% chance to corrupt the entire file" is just not usable...
Especially in Zed, where the only way to switch to hard tabs is buried in the settings and impossible to change per buffer.
Lack of control over your editor's behaviour shouldn't be acceptable at this level. Just like making tabs/spaces visible, control like this should be a basic feature of every editor.
You'd think more editors would be smart enough to recognize that it's a TSV file and therefore should preserve the tabs, in much the same way that you'd think editors would be smart enough to recognize that something's a Makefile and therefore should preserve the tabs.
It gets tricky when you have a TSV inside Markdown. I don’t think I’ve ever seen tabs used for indentation in Markdown in the wild, though it probably does work.
We could, however, make the Tab key insert spaces if the cursor is in the beginning of a line, and a literal \t if it’s in the middle. This way, you can write a TSV table pretty much anywhere you want.
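The proposed Tab-key heuristic is simple enough to sketch; this is a hypothetical keybinding handler (the function and parameter names are mine), not any editor's real API.

```python
def handle_tab(line_before_cursor, tab_width=4):
    """Hypothetical editor keybinding: Tab inserts spaces when used
    for indentation at the start of a line, but a literal tab
    elsewhere (e.g. between TSV cells), as suggested above."""
    if line_before_cursor.strip() == '':
        return ' ' * tab_width  # only whitespace so far: indenting
    return '\t'                 # mid-line: hard tab for a TSV column
```

The obvious failure mode is a table whose first column starts at column zero, where the very first separator would be misread as indentation.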
I'm not very familiar with this model series, but I tried it in a conversation while prototyping a potential 'Vibesbench' (my eval). Here's Kimi on a RoboCop scene:
kimi k2 thinking turbo
"He’s not being poetic; he’s weaponizing nostalgia. Quoting Casablanca—the ultimate “beautiful sacrifice” romance—to taunt Morton about corporate backstabbing is pure executive sleaze. It’s the same move as his mahogany desk and flag pin: draping sentimental Americana over naked power. He’s essentially saying, “Isn’t it romantic, Bob? You fought the good fight, now you die.” "
grok 4.1
"So when Dick Jones smirks and says, “Time goes by, it’s the same old story… the fight for love and glory, huh, Bob?” that’s pure Ed Neumeier flexing his love of old Hollywood while simultaneously mocking how these psychopathic ’80s execs think they’re in some grand cinematic drama instead of just murdering a guy for market share."
chatgpt-4o
"The reason the Casablanca reference matters is RoboCop itself is partially a noir—and also a tragedy built inside a genre engine. [… ] So yeah, calling back to Casablanca is not just some meta-joke—it’s a knowing wink that RoboCop is a noir trapped in cybernetic skin. // Even Clarence Boddicker, for all his cartoonish sadism, plays like a twisted stand-in for a noir heavy—talking smooth, dropping nicknames, throwing out casual violence like it’s a punchline."
opus 4.5
"Jones is performing avuncular menace. He's the kind of executive who'd quote old movies while having you murdered, because it's all performance to him, all style. The substance is just power."
It's puzzling to me that all this theorizing doesn't just look at the actual effects of AI. It's very non-intuitive
For example the fact that AI can code as well as Torvalds doesn't displace his economic value. On the contrary he pays for a subscription so he can vibe code!
The actual work AI has displaced is stuff like freelance translation, graphic illustration, 'content writing' (writing SEO-optimized pages for Google), etc. That's instructive, I suppose: if your income source can already be put on Upwork, then AI can displace it.
So even in those cases there are ways to not be displaced. Diplomatic translation work, for instance, can be part of a career rather than just a task, so the tool doesn't replace your 'job'.
He used it to generate a little visualiser script in python, a language he doesn't know and doesn't care to learn, for a hobby project. It didn't suddenly take over as lead kernel dev.
I think AI displacing graphics illustrators is a tragedy.
It's not that I love ad illustrations, but it's often a source of income for artists who want to be doing something more meaningful with their artwork. And even if I don't care for the ads themselves, for the artists it's also a form of training.
As someone who has to switch between three languages every day, fixing the text is one of my favourite usages of LLMs. I write some text in L2 or L3 as best as I can, and then prompt an LLM to fix the grammar but not change anything else. Often it will also explain if I'm getting the context right.
That being said, having it translate to a language one doesn't speak remains a gamble; you never know it's correct, so I'm not sure I'd dare use it professionally. Recently I was corrected by a marketing guy who is a native speaker of yet another language, because I used a ChatGPT translation for an error message. Apparently it didn't sound right.
Re displacing freelance translation, yes - it can displace the 95% of cases where 95% accuracy is enough. Like you mention though, for diplomatic translations, court proceedings, pacemaker manuals etc you're still going to need a human at least checking every line since the cost of any mistake is so high
Both Trump presidencies have really shown how little check there is on the White House when it comes to coordinating among these agencies. Heck, literally one of the first things he did in Jan 2017 was try to find out which park ranger posted a sparse inauguration photo. It wouldn't even occur to me that he was the de facto boss of millions of people in this way.
Because consider the previous status quo: it was considered somehow scandalous for Bill Clinton to have an opinion on what his AG Janet Reno was doing.
Fischer plays 17... Be6, leaving his Queen hanging.
Standard narrative: "Fischer offers his Queen for a mating attack!"
Engine reality: 17... Be6 is the correct move. Trying to save the Queen actually loses the advantage.
Byrne taking the Queen (18. Bxb6) was a massive blunder. The engine actually wants Byrne to ignore the Queen and trade off Fischer's Knight on c3. He ends up with a Queen stranded on a3, a total spectator
Right. So I guess that's my quibble with the term sacrifice (shared by Rudolf Spielmann)
But what's interesting to me is the counterfactual: outside of these 3 queen moves, he would have lost the entire advantage. So it was a tactical shot, like capturing the Golden Snitch in Harry Potter.
Sure, I get what you're saying. It's still a sacrifice, but the compensation is just mate in two, so there's no real "sacrifice" here.
That being said, any sacrifice that doesn't guarantee a better (or at least equal) position isn't a sacrifice either, it's just "hope chess", aka a bad move. In Blitz or Bullet you can make the case for a "bad" sacrifice for positional complexity and putting time pressure on your opponent to make accurate defensive moves.
In the Opera game, Black just played a poor game start to finish. Giving up the bishop for the knight, pushing the B pawn while the king wasn't castled.
I had two “brilliant” moves in one chess.com game today. One was a bishop sacrifice that would have led to mate in three. The other was a queenside castle that the engine wanted me to do sooner. I suck at chess, although I did see the bishop sacrifice as the right move. The engine rated me at 1500 for the game.
This is a common theme, gambits are such depending on what your level and calculation depth is.
The Queen's Gambit opening (almost inarguably a gambit, as it's part of the well-accepted name of a second move) really isn't a gambit in the sense that you can always recover the pawn; however, it is a gambit in the sense that you temporarily give it up.
If we were particularly short-sighted, no doubt, responding to an early white bishop threat on g5 or b5 with a knight on f6 or c6 would look like a gambit, since we are sacrificing the knight, but lo and behold, we regain the minor piece afterwards with xf6 or xc6!
The distinction would be whether the gambit or sacrifice is solid or refutable. But it is in both cases a sacrifice.
When I see Dario say all this I wonder what he's seeing that seems so differentiated from my POV
I wonder how much of Anthropic's success is based on a hoard of prompt-to-code examples that they post-train on. We shouldn't assume Opus's web dev skills are the same as general intelligence.
Here's something Sonnet 4 did (Jul 2025) that I don't think a human intern would do when I made a wikisource tool (transcript follows)
#-#-#
F:
let’s see how the wikisource tool wrks..
let’s try to match “Mary hath chosen that good part” in one of the bible sources they should have
Claude:
Great idea! Let me first search for some biblical content on WikiSource, then use the content match tool.
wiki_source_search
Perfect! Let me try the Douay-Rheims version of Luke, which should have the Mary and Martha story:
wiki_source_page_content_match
No matches! Let me try a slightly different phrasing – maybe it’s “hath chosen the good part”:
wiki_source_page_content_match
Still nothing. Let me try searching for just “Martha” to see if the story is there:
F:
dummy .. why would use kjv wording in a different edition lol
Claude:
Haha you’re absolutely right! I was being completely silly – of course the KJV wording “Mary hath chosen that good part” wouldn’t be in the Douay-Rheims translation. Different translations use different language entirely.
What I Did Wrong:
• Used King James Version phrasing in a Catholic translation
• The Douay-Rheims says “Mary hath chosen the best part”
• I was looking for “good part” (KJV wording)
Should Have Started With:
Just searching for “Martha” or “Mary” to see what the actual text says in that translation, then used the exact wording I found.
#-#-#
It's almost comedic.. "Perfect! Let me try the Douay-Rheims version of Luke" is where if you were watching a horror movie you'd say "Don't go down to the basement fridge!"
And I think a human when they first see the text match fail would go "hmm--ohh" but Claude starts doing permutations. This I guess is the side effect of Reinforcement Learning and system prompts that amount to saying: "Just do it. Don't ask questions. Just do it."
I find one-off anecdotal examples like this to be a bit like discourse around global warming - "Look at that ridiculous polar vortex we had this week! Global warming can't possibly be a thing!" Of course, a trend line comprises many points, and not every point falls perfectly in the center of the line! I'm not necessarily saying you are right or wrong, but your argument should address the line (and ideally give some reason why it might falter) rather than just a single point on that line.
Ah, but I'm not arguing about the rate of change in the trend. I'm saying the signals are decoupled. That is to say, an LLM can be as good a programmer as Linus Torvalds without having even the basic knowledge-generalization abilities we assume the median human with no specialized skills would have (when given the same knowledge an LLM has).
I think most LLM proponents would say that "basic knowledge-generalization abilities" is on a different, slower trend line.
I mean, you aren't very surprised that your CPU can crush humans at chess but can barely run an image classifier, right? But you probably wouldn't say (as you are saying with LLMs) that ability for a CPU to play chess is "decoupled" from classifying images. Increases in CPU speed improve both. You'd just say that one is a lot harder than the other.
I was asking it to see if the wikisource tools were working by looking up a Bible quote. There was no ambiguity about the task itself; what I'm saying is that Claude 'knows' a bunch of things (the Bible has different translations) that it doesn't operationalize when doing a task--issues that would be glaringly obvious to a human who knows the same things.
Maybe I'm missing the point as well, but what did it do wrong?
It seemed like you wanted to see if a search tool was working.
It looked to see. It tried one search using one data source (KJV) and found no matches. The next question would be: is the quote not in there, is the quote mis-remembered, or is there something wrong with the data source? It tries an easier-to-match quote and finds nothing, which it finds odd. So the next step in debugging is to assume a hypothesis that the KJV datasource is broken, corrupted, or incomplete (or not working for some other reason). So it searches for an easier quote using a different datasource.
The next bit is unclear because it looks like you may have interrupted it, but it seems like it found the passage about Mary in the DR data source. So using elimination, it now knows the tool works (it can find things) and the DR data source works (it can also find things), so back to the last question of eliminating hypotheses: is the quote wrong for the KJV datasource, or is that datasource broken?
The next (and maybe last) query I would do, and what it chose, was to search for something guaranteed to be there in the KJV version: the phrase 'Mary'. Then scan through the results to find the quote you want, then re-query using the exact quote you know is there. You get three options.
If it can't find 'Mary' at all in the KJV dataset, then the datasource is likely broken. If it finds 'Mary' but the results don't contain the phrase, then the datasource is incomplete. If the results do contain the phrase, then search for it; if that search fails, you've narrowed the issue down to "phrase-based search seems to fail". If it does find it, and it's the exact quote it searched for originally, then you know search has an intermittent bug.
This seemed like perfect debugging to me - am I missing something here?
And it even summarized at the end how it could've debugged this process faster: don't waste a few queries up front trying to pin down the exact quote; search for "Mary", get a quote that is in there, then search for that quote.
This seems perfectly on target. It's possible I'm missing something though. What were you looking for it to do?
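The elimination procedure above can be written out as a small decision function. This is just a sketch of the reasoning; `search` is a hypothetical helper standing in for the wikisource tool, and all names here are illustrative, not a real API.

```python
def diagnose_search(search, exact_quote, datasource='KJV'):
    """Run the three-way elimination described above against one
    datasource: broken source, incomplete source, or broken
    phrase-based search."""
    hits = search(datasource, 'Mary')  # a term guaranteed to appear
    if not hits:
        return 'datasource likely broken'
    if not any(exact_quote in h for h in hits):
        return 'datasource incomplete'
    if not search(datasource, exact_quote):
        return 'phrase-based search fails'
    return 'search works; original failure was intermittent'

# Toy stand-in for the tool, to show the flow end to end:
corpus = {'KJV': ['Mary hath chosen that good part']}
def fake_search(ds, phrase):
    return [p for p in corpus.get(ds, []) if phrase in p]
```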
What I was expecting is that it would pull up the KJV using the results returned from the wiki_source_search tool, instead of going to a totally different translation and then doing a text match for a KJV quote.
I'm beginning to think that origin stories are an underrated way to find these angles. Like why exactly did you start thinking about this topic. I guess the recipe bloggers were on to this with their long rambles about where they first tried this dish (albeit it may have been for SEO too...)