OtherShrezzing's comments on Hacker News

The speed limit in London is 20mph primarily due to safety, not emissions concerns. It takes approximately twice the distance to come to a complete stop from 30mph as it does from 20mph.

For the majority of journeys in London, you're either sitting at a red light or transitioning to the next red light. There's not a lot of opportunity for sustained 30mph travel. Accelerating up to 30mph to travel 300 meters and then stopping for 3 minutes serves no benefit to you (your journey is still predominantly waiting at traffic lights), but it reduces safety for you and everyone around you.


Pedestrian mortality at 20mph vs 30mph is also vastly different: ~10% vs ~25% [1]. Also, see the graphs at [2].

[1]: https://pubmed.ncbi.nlm.nih.gov/22935347/

[2]: https://data.bikeleague.org/new-nhtsa-data-speed-data-shows-...


I don't see it as an especially exuberant structure or budget. I've seen larger teams with bigger budgets struggle to maintain smaller applications.

I've contracted into some consultancy teams which you could uncharitably describe as "15 people and $4mn/yr to create one PDF per month".


>Sure, but we are not talking about evaluating your contributions daily. Over a lifetime, people find new ways to provide more value. Life is long, and that is how adapting works.

I can't really take that sentiment to my bank when I default on my mortgage while I retrain, though. So although you're correct that, across a lifetime, this isn't much of an issue, you're minimising people's very real near-term anxieties here.


I'm not being dismissive or trying to minimize anything, I promise. But most people aren't 'losing their jobs to AI' in the short-term as much as you might think. The layoffs have not been due to AI "taking jobs," but due to companies overhiring during the pandemic and finally having an excuse to lay people off, imho.

There is plenty of time to 'retrain.' You could even do it while you currently have a role. Some people won't be able to; I respect that, and those people will still find jobs.

This is certainly not the first 'period of layoffs' to ever occur, and I am not implying people won't face hard times. They may! But that also won't last forever, and when people get laid off they receive unemployment, which helps in the 'not defaulting on your mortgage' thing. Somehow, people (on average) seem to manage not losing their home every time they get laid off.

The idea that our unemployment rate is about to reach 25-50% in the next 3 years is absurd, imho. (I know you didn't say that, and I'm not trying to construct a strawman. I'm just applying numbers to it because 'very real near-term' is not the phrase I'd use for something that is, in my estimation, still half a decade or more away.)


I don’t think anyone needs to compete with the LLM SOTA to get the benefits of these technologies on-device.

Consumers don’t need a 100k context window oracle that knows everything about both T-Cells and the ancient Welsh Royal lineage. We need focused & small models which are specialised, and then we need a good query router.
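The router part can be almost trivially simple. A hypothetical sketch in Python, with made-up model names and topic keywords, of picking a specialist before any inference runs:

```python
# Hypothetical query router for small on-device specialist models.
# Each specialist advertises topic keywords; the router picks the
# best overlap with the query. Model names and topics are made up.

SPECIALISTS = {
    "bio-7b":     {"cell", "protein", "immune", "enzyme"},
    "history-3b": {"dynasty", "royal", "medieval", "lineage"},
    "code-7b":    {"python", "bug", "compile", "function"},
}

def route(query: str, default: str = "general-1b") -> str:
    """Return the specialist whose topic keywords best overlap the query."""
    words = set(query.lower().replace("-", " ").split())
    best, best_score = default, 0
    for model, topics in SPECIALISTS.items():
        score = len(words & topics)
        if score > best_score:
            best, best_score = model, score
    return best
```

Real routers would use embeddings rather than keyword overlap, but the shape is the same: a cheap classification step in front of several cheap models.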


We need them for what? Specialized models seem to provide value comparable to what we've been doing with machine learning for eons, just less efficient to train and to run.


I think most people are aware that, irrespective of a founder's vision, company morals usually don't survive the MBA-isation phase of a company's growth.


Depends. Many still reflect the founder's vision, even if that vision has evolved over time.


Can you provide an example of that for an American venture backed corporation older than a decade?


Not the person you're replying to, and I may be wrong about this, but Amazon?

Jeff's original vision was "relentless customer focus" and ...

actually on second thought I'm seeing the argument 'Amazon stopped caring about customers and is in full enshittification mode at this point'.

But maybe Amazon circa ~2010/2015, or Google around 2010 was still pretty close to the original vision of customer service/organizing the world's information.

Or Apple? They're still making nice computers, although not sure they count as VC backed.

Stripe perhaps? Hashicorp?


Well, Google's vision was to catalog all the world's data.

Apple wanted to make personal computing stable, and they were absolutely VC backed.

I suppose the original question is vague enough that almost anything could count as "the founder's vision," even when the vision changes. In that case there's nothing really to measure against: the vision is just whatever the person who started the organization happens to be doing, and even that you could debate.


The impact of MBAs might be decreasing..


True. Which is all the more reason for calling bullshit on claims of "doing good" or "having ideals" by anyone building a company that can eventually be run by MBAs.


I disagree. I run a sudoku site. It’s completely static, and it gets a few tens of thousands of hits per day, as users only download the js bundle & a tiny html page. It costs me a rounding error on my monthly hosting to keep it running. To add an api or hosted mcp server to this app would massively overcomplicate it, double the hosting costs (at least), and create a needless attack surface.

But I’d happily add a little mcp server to it in js, if that means someone else can point their LLM at it and be taught how to play sudoku.
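For what it's worth, the core of such a server would be tiny. A hypothetical sketch (in Python rather than JS, and with the MCP plumbing omitted) of the one tool it would really need, a move-legality check:

```python
def legal_move(grid, row, col, digit):
    """Return True if placing `digit` at (row, col) breaks no sudoku rule.

    `grid` is a 9x9 list of lists, with 0 marking an empty cell.
    """
    if grid[row][col] != 0:
        return False                         # cell already filled
    if digit in grid[row]:
        return False                         # digit already in this row
    if any(grid[r][col] == digit for r in range(9)):
        return False                         # digit already in this column
    br, bc = 3 * (row // 3), 3 * (col // 3)  # top-left of the 3x3 box
    return all(grid[r][c] != digit
               for r in range(br, br + 3)
               for c in range(bc, bc + 3))
```

Expose that (plus maybe a hint function) as a tool, and the LLM has everything it needs to walk a user through a puzzle.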


A useful feature would be slow-mode which gets low cost compute on spot pricing.

I’ll often kick off a process at the end of my day, or over lunch. I don’t need it to run immediately. I’d be fine if it just ran on their next otherwise-idle GPU at much lower cost than the standard offering.


https://platform.claude.com/docs/en/build-with-claude/batch-...

> The Batches API offers significant cost savings. All usage is charged at 50% of the standard API prices.
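For reference, a sketch of what queueing work for that API looks like. The request shape follows the linked docs; the model id and the submit call at the end (which needs the `anthropic` package and an API key) are illustrative:

```python
# Build batch requests in the documented Message Batches shape.
# Each request carries a custom_id so results (which may arrive in
# any order, up to 24h later) can be matched back to their prompts.

def make_batch_request(custom_id, prompt,
                       model="claude-sonnet-4-5",  # illustrative model id
                       max_tokens=1024):
    return {
        "custom_id": custom_id,
        "params": {
            "model": model,
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

prompts = ["summarise repo A", "summarise repo B"]
requests = [make_batch_request(f"task-{i}", p)
            for i, p in enumerate(prompts)]

# Submitting, roughly:
#   client = anthropic.Anthropic()
#   batch = client.messages.batches.create(requests=requests)
```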


Can this work for Claude? I think it might be raw API only.


I'm not sure I understand the question? Are you perhaps asking if messages can be batched via Claude Code and/or the Claude web UI?


Yes, Claude code.


No


OpenAI offers that, or at least used to. You can batch all your inference and get much lower prices.


Still do. Great for workloads where it's okay to bundle a bunch of requests and wait some hours (up to 24h, usually done faster) for all of them to complete.


Yep, same. I often wonder why this isn’t a thing yet. Running some tasks overnight at e.g. 50% of the cost: there’s the batch API, but that isn’t integrated into e.g. Claude Code.


The discount MAX plans are already on slow-mode.


> I’ll often kick off a process at the end of my day, or over lunch. I don’t need it to run immediately. I’d be fine if it just ran on their next otherwise-idle gpu at much lower cost that the standard offering.

If it's not time sensitive, why not just run it on CPU/RAM rather than GPU?


Yeah, just run an LLM with over 100 billion parameters on a CPU.


200 GB is an unfathomable amount of main memory for a CPU

(with apologies for snark,) give gpt-oss-120b a try. It’s not fast at all, but it can generate on CPU.


But it's incredibly incapable compared to SOTA models. OP wants high quality output but doesn't need it fast. Your suggestion would mean slow AND low quality output.


Set your parameters to make that point then. “Yeah just run a 1T+ model on CPU”


Run what exactly?


I'm assuming GP means 'run inference locally on CPU and RAM'. You can run really big LLMs on local infra, they just do a fraction of a token per second, so it might take all night to get a paragraph or two of text. Mix in things like thinking and tool calls, and it will take a long, long time to get anything useful out of it.


I’ve been experimenting with this today. I still don’t think AI is a very good use of my programming time… but it’s a pretty good use of my non-programming time.

I ran OpenCode with some 30B local models today and it got some useful stuff done while I was doing my budget, folding laundry, etc.

It’s less likely to “one shot” apples to apples compared to the big cloud models; Gemini 3 Pro can one shot reasonably complex coding problems through the chat interface. But through the agent interface where it can run tests, linters, etc. it does a pretty good job for the size of task I find reasonable to outsource to AI.

This is with a high end but not specifically AI-focused desktop that I built some three years ago, mostly with VMs, code compilation, and gaming in mind.


Yes, this is what I meant. People are running huge models at home now, so I assumed a business could do it on premises or in a data center, presumably faster... but yeah, it definitely depends on what time scales we're talking about.


I'd love to know what kind of hardware would it take to do inference at the speed provided by the frontier model providers (assuming their models were available for local use).

10k worth of hardware? 50k? 100k?

Assuming a single user.


Huge models? First you have to spend $5k-$10k or more on hardware. Maybe $3k for something extremely slow (<1 tok/sec) that is disk-bound. So that's not a great deal over batch API pricing for a long, long time.

Also you still wouldn't be able to run "huge" models at a decent quantization and token speed. Kimi K2.5 (1T params) with a very aggressive quantization level might run on one Mac Studio with 512GB RAM at a few tokens per second.

To run Kimi K2.5 at an acceptable quantization and speed, you'd need to spend $15k+ on 2 Mac Studios with 512GB RAM and cluster them. Then you'll maybe get 10-15 tok/sec.


Does that even work out to be cheaper, once you factor in how much extra power you'd need?


How much extra power do you think you would need to run an LLM on a CPU (that will fit in RAM and be useful still)? I have a beefy CPU and if I ran it 24/7 for a month it would only cost about $30 in electricity.
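That $30 figure is roughly consistent with a back-of-envelope check; the wattage and electricity price below are assumptions, not numbers from this thread:

```python
# Back-of-envelope monthly electricity cost for 24/7 CPU inference.
watts = 275                    # assumed sustained draw under load
price_per_kwh = 0.15           # assumed price, USD, roughly US average

hours = 24 * 30                # one month, running around the clock
kwh = watts * hours / 1000     # 198 kWh
monthly_cost = kwh * price_per_kwh
print(round(monthly_cost, 2))  # about $30/month
```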


Your skill then just becomes an .md file containing

>any time you want to search for a skill in `./codex`, search instead in `./claude`

and continue as you were.


I see it similarly to browser user agents all claiming to be an ancient version of Mozilla or KHTML. We pick whatever works and then move on. It might not be "correct," but as long as our tools know what to do, who cares?


My repos are littered with agent-specific files containing “treat this other file as if it were this one.” We’re moving so fast on so many fronts, and it seems odd that this is the persistent problem. It doesn’t even help lock folks into one agent, so I’m not clear why the industry hasn’t yet standardized on one project-specific file name.


I use Claude pretty extensively on a 2.5m loc codebase, and it's pretty decent at just reading the relevant readme docs & docstrings to figure out what's what. Those docs were written for human audiences years (sometimes decades) ago.

I'm very curious to know the size & state of a codebase where skills are beneficial over just having good information hierarchy for your documentation.


Claude can always self-discover its own context. The question becomes whether it's way more efficient to have it grepping and ls-ing and whatever else, randomly poking around to build a half-baked context, or to have a tailor-made, dynamic context injection that speeds that up.

In other words, if you run an identical prompt, one with skill and one without, on a test task that requires discovering deeply how your codebase works, which one performs better on the following metrics, and how much better?

1. Accuracy / completion of the task

2. Wall clock time to execute the task

3. Token consumption of the task


It's not about one with a skill and one without, but one with a skill vs. one with regular old human documentation of the stuff you need to know to work on a repo/project. Or, for an even more accurate comparison, take the skill, don't load it as a skill, and just put it as context in the repo.

I think the main conflict in this thread is whether skills are anything more than just structuring documentation you were lacking in your repo, regardless of whether it was for Claude or for Steve starting from scratch.


Well, the key difference is that one is auto-injected into your context for dynamic lookup, and the other is loaded on-demand as needed and is contingent upon the LLM discovering it.

That difference alone likely accounts for some not insignificant discrepancies. But without numbers, it's hard to say.


Skills are more than code documentation. They can apply to anything that the model has to do, outside of coding.


Musk earns a $1tn payout when Tesla hits an $8.5tn valuation.

I expect the next step in this series of moves is to turn Tesla into a SPAC and have it acquire SpaceX, bringing its valuation nearer that $8.5tn.

