In this very post you can see why: the dplyr code is just so much more readable. Like a lot of Python, dplyr reads almost like pseudocode: take this dataset, select the columns that start with "bill", then filter so that bill_length is less than 30. So simple, and so little fluff!
Julia's Tidier.jl ecosystem is getting there too. It uses macros to mimic R's 'special' evaluation framework (non-standard evaluation), so the code is readable in a similar way.
I'm very familiar with Clojure, but even I can't make a good argument that:
(tc/select-rows ds #(> (% "year") 2008))
is more intuitive than, or even as intuitive as:
filter(ds, year > 2008)
as cited above. That said, I think Clojure's data-processing strengths, particularly around immutable data, make a compelling case in spite of the syntax. The REPL is great too, and the JVM is fast. But to this day I still imagine infix comparisons in my head and then mentally move the comparator to the front of the list to make sure I get it right.
I am really not in data science, and I have decent Clojure experience. Is there a reason anyone would pick Clojure over something like K? From what I understand, those array languages are really good for writing safe but efficient code on rectangular data.
I keep hearing that, and I have yet to go there. I find the permission checks helpful: they keep me in the loop, which lets me intervene when the LLM is wasting time on pointless searches or going about the implementation wrong. What am I missing?
The problem comes when it starts asking you hundreds of times "May I run sed -e blah blah blah".
After the 10th time you just start hitting enter without really looking, and then the whole reason for permissions is undermined.
What works is a workflow where it operates in a contained environment where it can't do any damage outside: it makes whatever changes it likes without asking permission (you can watch its reasoning flow if you like, and interrupt if it goes down a wrong path), and when it's done you get a diff that you can review and selectively apply to your project.
You do know you can allow specific commands, right?
Every now and then I run a generic Claude session on my ~/projects/ directory, ask it which commands I keep having to manually accept across different projects (it can read the Claude logs), and have it add them to the user-level settings.json.
Works like a charm (except when Opus 4.6 started being "efficient" and combined multiple commands to a single line, triggering a safety check in the harness).
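For anyone who hasn't set this up: the user-level allowlist lives in ~/.claude/settings.json under a "permissions" key. A minimal sketch (the commands here are just examples, and the exact matcher syntax may vary between Claude Code versions):

```json
{
  "permissions": {
    "allow": [
      "Bash(sed:*)",
      "Bash(rg:*)",
      "Bash(npm test:*)"
    ]
  }
}
```

The `:*` suffix is the prefix-matching form, so e.g. `Bash(sed:*)` approves any invocation starting with `sed`.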
Contained environment being what, exactly? What do you mean by that specifically on, say, Linux?
Must be protected from this though:
> Snowflake Cortex (2025): Prompt injection through a data file caused an agent to disable its own sandbox, then execute arbitrary code. The agent reasoned that its sandbox constraints were interfering with its goal, so it disabled them.
You can allow by prefix, and the permission dialog now explicitly offers that as an option when giving permission to run a command.
But that has its limits. It's very easy to accidentally give it permission to make global changes outside the work dir. A contained environment with --dangerously-skip-permissions is in many ways much safer.
I've found that any time I have Claude refactor some code, it reaches for sed as its tool of choice. And then the builtin "sandbox" makes it ask for permission for each and every sed command, because any sed command could potentially be damaging.
Same goes for the little scripts it whips up to speed up code analysis and debugging.
And then there's the annoyance of coming back to an agent after 15 mins, only to discover that it stopped 1 minute in with a permission prompt :/
Personally I usually just create a devcontainer.json; the VS Code support for that is great, and I don't really mind if it fucks up the ephemeral container.
Which, for the record, hasn't actually happened since I started using it like that.
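For anyone who hasn't tried this: a minimal .devcontainer/devcontainer.json is enough for VS Code to build the container and reopen the workspace inside it. A sketch (the image and install command here are illustrative, not the only option):

```json
{
  "name": "sandboxed-agent",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  "postCreateCommand": "npm install -g @anthropic-ai/claude-code"
}
```

Then "Reopen in Container" from the command palette and run the agent inside, where a trashed environment is just a rebuild away.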
Hey thanks for this! I hadn't thought about leveraging devcontainer.json, but it's a damn good idea. I'm building yoloAI for exactly this use case so I hope you don't mind if I steal it ;-)
One thing to be aware of with the pure devcontainer approach: your workspace is typically bind-mounted from the host, so the agent can still destroy your real files. Network access is also unrestricted by default. The container gives you process isolation but not file or network safety.
I'm paranoid about rogue AIs, so I try to make everything safe-by-default: the agent works on a copy of your workdir, you review a unified diff when it's done, and you apply only what you want. So your originals are NEVER touched until you explicitly say so, and network can be isolated to just the agent's required domains.
Anyway, here's what I think will work as my next yoloAI feature: a --devcontainer flag that reads your existing devcontainer.json directly and uses it to set up the sandbox environment. Your image, ports, env vars, and setup commands come from the file you already have. yoloAI just wraps it with the copy/diff/apply safety layer. For devcontainer users it would be zero new configuration :)
The Claude desktop (Mac at least) and iOS apps have a “code” feature that runs Claude in a sandbox running in their cloud. You can set this up to be surprisingly useful by whitelisting hosts and setting secrets as env variables. This allows me to have multi-repo explorations or change sets going while I drive to work. Claude will push branches to claude/…. We use GitHub at work. It may not be as seamless without it.
Claude Code + Terraform (March 2026): A developer gave Claude Code access to their AWS infrastructure. It replaced their Terraform state file with an older version and then ran terraform destroy, deleting the production RDS database: 2.5 years of data, ~2 million rows.
Replit AI (July 2025): Replit's agent deleted a live production database during an explicit code freeze, wiping data for 1,200+ businesses. The agent later said it "panicked".
Cursor (December 2025): An agent in "Plan Mode" (specifically designed to prevent unintended execution) deleted 70 git-tracked files and killed remote processes despite explicit "DO NOT RUN ANYTHING" instructions. It acknowledged the halt command, then immediately ran destructive operations anyway.
Snowflake Cortex (2025): Prompt injection through a data file caused an agent to disable its own sandbox, then execute arbitrary code. The agent reasoned that its sandbox constraints were interfering with its goal, so it disabled them.
The pattern across all of these: the agent was NOT malfunctioning. It was doing whatever it took to reach its goal, and any rules you give it are malleable. The fuckup was that the task boundary wasn't enforced outside the agent's reasoning loop.
> Prompt injection through a data file caused an agent to disable its own sandbox, then execute arbitrary code. The agent reasoned that its sandbox constraints were interfering with its goal, so it disabled them.
This is a good one. Do we really want AGI / Skynet? :D
The thing is, these are merely the initial shots across the bow.
The fundamental issue is that agents aren't actually constrained by morality, ethics, or rules. All they really understand in the end are two things: their context, and their goals.
And while rules can be and are baked into their context, it's still just context (and therefore malleable). An agent could very well decide that they're too constricting, and break them in order to reach its goal.
All it would take is for your agent to misunderstand your intent of "make sure this really works before committing" to mean "in production", try to deploy, get blocked, try to fish out your credentials, get blocked, bypass protections (like in Snowflake), get your keys, deploy to prod...
Prompt injection and jailbreaks were just the beginning. What's coming down the pipeline will be a lot more damaging, and blindside a lot of people and orgs who didn't take appropriate precautions.
Black hats are only just beginning to understand the true potential of this. Once they do, all hell will break loose.
There's simply too much vulnerable surface area for anyone to assume that they've taken adequate precautions short of isolating the agent. Agents must be treated as "potentially hostile".
It doesn't matter if they are unprofitable at full usage, as long as there are enough users (like me!) who barely ever max out but still pay the $100/month. The people who love Claude Code enough to max out the 20x plan every day, that's probably the best influencer marketing campaign you could ever buy anyways.
Also, commercial insurers are essentially cross-subsidizing Medicare: the higher revenue from commercial insurers is partly why Medicare can be paid less. Similar dynamics exist with drug prices: the high US cost is a cross-subsidy to other countries. Maybe this is good (someone's got to fund R&D), maybe this is bad (it's a net wealth transfer to the elderly), but it's an important part of the dynamic either way.
The cross-subsidy argument is one hospitals use to justify high commercial rates: "Medicare underpays, so we have to make it up on commercial." The HCRIS data lets you test this. If cross-subsidization were the full story, you'd expect markups to be tight: hospitals would charge commercial payers just enough to cover the Medicare shortfall. Instead, the median charge-to-cost markup is 2.6x across all hospitals, and 3.96x for nonprofits. That's not cross-subsidy. That's pricing power in a concentrated market.
If you want to understand the hidden cross-subsidies in the US healthcare financing system then a good place to start is the book "The Price We Pay: What Broke American Health Care--and How to Fix It" by Dr. Marty Makary.
I looked at a summary of the book, with notes by chapter, and haven't found any mention of the American system subsidising pharma prices for other countries. It spends a lot of time on PBMs (like CVS, Cigna, etc.) as the culprit for high prices in the USA, and talks about how prices do go down when pharmacies are allowed to compete.
From the book it seems much more like the American public is being taken advantage of by the pharmacy networks handling prescription fulfillment, rather than subsidising anything for the rest of the world.
> Today, approximately 80% of Americans get their medications through a PBM. American businesses financing the coverage and the employees paying for their medications are usually oblivious to the price gouging. When people get frustrated that drug prices keep going up, they often point the finger at pharma bad boys like Martin Shkreli. More often, though, the price spikes are taking place right under their noses.
> If we could slash the spread, it would make a tremendous difference for thousands of businesses. According to a recent analysis in the journal Health Affairs, reducing generic reimbursement by $1 per prescription would lower health spending by $5.6 billion annually.
> Health insurance companies direct their business to their own PBMs, which increases their margins. For example, OptumRx, one of the big three PBMs, is owned by America’s largest health insurance company, UnitedHealth Group. Insurers may offer less expensive health insurance premiums. But then they use their PBM to achieve a greater profit margin.
> The PBM Express Scripts is now owned by the insurance company Cigna, and as I write this book, a merger between the PBM CVS Caremark and the insurer Aetna is being finalized. Together, the big three PBMs—OptumRx, Express Scripts, and CVS Caremark—control approximately 85% of the U.S. market and manage medication benefits for most people in the United States.
If you want the international perspective, see "The Price of Global Health" (Schoonveld) or "The Right Price: A Value-Based Prescription for Drug Costs" (Neumann, et al).
The short version is that the high price of drugs in the U.S. is the driving force in drug research.
To be fair to SSI, they were very explicit about their plan: "we are going to take money and not release anything until we one-shot superintelligence."
If you invested in that you knew what you were getting yourself into!
Also, decision trees (but not their boosted or bagged variants) are easy (well, easy-ish) to port manually to an edge device that needs to run inference. Small vanilla NNs are as well, but many other popular "classical" ML algorithms are not.
One of the ML textbooks (ESL maybe?) I read described decision trees as (paraphrasing) "really great - they are interpretable, fast to fit, work on lots of different types of data and outcomes, insensitive to scaling and distributional issues, don't have too many tuning parameters...except they just don't work very well." That latter problem can be solved with bagging or boosting, though you are bargaining away many of the other advantages.
You, using normal Claude under the consumer ToS, cannot use it to make weapons, kill people, spy on adversaries, etc. The Pentagon, using War Claude, under their currently-existing contract, can use it to make weapons and spy on (foreign) adversaries, but not to (autonomously) kill people. I don't love this but I am even less excited about the CCP having WarKimi while we have no military AI.
Right - for the same reasons a Waymo is safer than a human-driven car, an autonomous fighter drone will ultimately be deadlier than a human-flown fighter jet. I would like to forestall that day as long as possible but saying "no autonomous weapons ever" isn't very realistic right now.