Agreed, but you are talking about a POC, and he is talking about reliable, working software.
This generation of LLMs is perfect for POCs, and there you can get a 10x speedup, no question.
But going from a POC to working, reliable software is where most of our time is spent anyway, even without LLMs.
With LLMs this phase becomes worse.
We speed up the POC phase 10x, then slow down almost as much in the phases that follow, because now you have a POC of 10k lines that you are not familiar with at all, that you have to pay way more attention to at code review,
and that has security bolted on as an afterthought (a major slowdown now, so much so that there are dedicated companies whose business model has become fixing the security problems caused by LLM POCs).
Next phase: POCs are almost always 99% happy path. Edge cases get bolted on as yet another afterthought, and because you did not write any of those 10k lines, how do you even know which edge cases might be necessary to cover? Maybe you guessed right; spend even more time studying the unfamiliar code.
We use LLMs extensively now in our day-to-day work. Development has become somewhat more enjoyable, but there is, at least as of now, no real improvement in final delivery times; we have just redistributed where the effort and time go.
At our company we use AI extensively to see if we missed edge cases, and it does a pretty good job of pointing us to places that could be handled better.
I know we all think we are always so deep in absolutely novel territory that only our beautiful mind can solve. But the vast majority of work done in the world is transformative: you take X + Y and you get Z. Even with a brand-new API, you can just slap in the documentation and navigate it an order of magnitude faster than without.
I started using it for embedded systems, doing something for which I could literally find nothing in Rust but plenty in Arduino/C code. The LLM made that process so much faster.
Then you are misunderstanding the downvoting. It's not the fact that they are burning money; it's that this costs 20k today, but that is not the real cost once you factor in that they are losing money at this price.
So tomorrow, when this "startup" has to come out of its money-burning phase, as every startup has to sooner or later, that cost will increase, because there is no other monetization avenue, at least not for Anthropic, which "will never use ads".
At 20k this "might" be a reasonable cost for "the project"; at 200k it might not.
According to that article, the data they analyzed was API prices from LLM providers, not their actual cost to perform the inference. From that perspective, it's entirely possible to make "the cost of inference" appear to decline by simply subsidizing it more. The authors even hint at the same possibility in the overview:
> Note that while the data insight provides some commentary on what factors drive these price drops, we did not explicitly model these factors. Reduced profit margins may explain some of the drops in price, but we didn’t find clear evidence for this.
What in the world would the profit motive be to “make it appear” that inference cost is declining? Any investors would have access to the real data. End users don’t care. Why would you do the work for an elaborate deception?
Is someone forcing you to work at a company chasing the latest fad?
You go and choose the type of company you want.
The keyword here being "Choice".
There are plenty of workers who would not mind carrying a little more risk at those companies for better pay, and those companies could offer better pay if they weren't made to jump through hoops.
A classic example is the "consultant" contract.
Companies pay through the nose to hire "consultants" because it's the easiest way they can try a new idea that might not work.
Company A pays Company B 1k/day for that "consultant". Company B does not have enough capacity, so it gets the consultant from Company C for 700/day. Company C does the same and gets the consultant from Company D for 500/day. Company D actually employs the consultant and pays him 200/day.
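The chain is easy to put in numbers (the day rates are the ones from the example above; the derived figures just follow from them):

```python
# Day rates from the example chain: A -> B -> C -> D -> employee.
rate_paid_by_a = 1000        # what Company A pays Company B per day
rate_paid_to_employee = 200  # what Company D pays the consultant per day

intermediary_cut = rate_paid_by_a - rate_paid_to_employee
employee_share = rate_paid_to_employee / rate_paid_by_a

print(f"Lost to intermediaries: {intermediary_cut}/day")          # 800/day
print(f"Employee's share of what A pays: {employee_share:.0%}")   # 20%
```

Four-fifths of what Company A spends never reaches the person doing the work.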
In this whole useless chain, built to avoid "can't fire even if your project doesn't work", both Company A and the actual employee are losing big. Both would be much better off with a direct contract at 500/day, with the understanding that it might not work out after all.
Notice that for that employee, stability is not there even now. Yes, he continues to be employed by Company D when Company A stops the contract, but he is effectively moved to another contract with Company X. He is effectively changing jobs: new responsibilities, new colleagues, new rules, etc. Only in the eyes of the state is he not changing jobs.
You are thinking of the continuity of work, colleagues, and responsibilities. Those are of course good to have, especially from the perspective of the company.
But from the perspective of the individual worker, a much more important continuity is the one of salary, insurance, and pension. This is the stability that the employee continues getting at Company D.
No, what I'm thinking of is having the option to choose between salary stability and salary risk.
In that scenario, if the employee is fired after just two years (worst case) because the project doesn't work, he is still financially better off (compared to what he was getting at the "stable" salary), even if it takes him two full years to find another job (something very unlikely in a dynamic employment market).
In the current scenario of contract work, the employee gets all the negative effects of changing jobs without any of the upside.
Of course, I am not talking about full US-style at-will firing, but something in the middle: the option to fire with adequate notice, let's say 3 months, and adequate compensation, let's say 3 months of pay after finishing your time.
This way the employee has 6 months for job hunting (I think that is the sweet spot that makes it reasonable for both the company and the employee).
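For the worst case above, the break-even is simple arithmetic (the two-year figures are from the scenario; the salary numbers are hypothetical):

```python
def risky_beats_stable(risky_salary, stable_salary, years_employed, years_unemployed):
    """True if the risky job pays more over the whole period,
    counting the unemployed gap at zero income."""
    return risky_salary * years_employed >= stable_salary * (years_employed + years_unemployed)

# Fired after 2 years, then 2 years of job hunting: the risky salary
# must be at least (2 + 2) / 2 = 2x the stable one to come out ahead.
print(risky_beats_stable(120_000, 50_000, 2, 2))  # True  (240k vs 200k)
print(risky_beats_stable(80_000, 50_000, 2, 2))   # False (160k vs 200k)
```

In other words, the risk premium has to scale with how long you expect the gap between jobs to be.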
You are free to open a consultancy yourself and take on the salary risk as well as reap the rewards.
Companies also currently aren't completely banned from firing people. If and when they do, it can look like you described, with adequate notice and compensation.
If developers are using Claude Code, with its quirks, Anthropic controls the backend LLM.
If developers are using OpenCode, it's easy for them to try different LLMs and maybe substitute one (temporarily or permanently).
In the enterprise market, once companies choose a tool they tend to stay with it even if it is not the best; the cost and time frame of switching are too high.
If developers could swap LLMs freely in their own tool, that would be a big missed opportunity for Anthropic. Not a user-friendly move, but the norm in enterprise.
Right now most enterprises are experimenting with different LLMs, and once they choose one, they will be locked in for a long time.
If they can't choose because their coding agent doesn't let them, they'll be locked into that too.
LLMs have read EVERYTHING, yes. That includes a lot of suboptimal solutions, repeated mantras about past best practices that are no longer relevant, thousands of blog posts about how to draw an owl by drawing two circles and leaving the rest as an exercise for the reader, etc.
The value of a good engineer is his judgment in the current context, something LLMs cannot do well.
A second point, something that is mentioned occasionally but not discussed seriously enough, is that the Dead Internet Theory is becoming a reality.
The supply of good, professionally written training material is by now exhausted, and LLMs will start to feed on their own slop.
See how little the LLMs' core competency has increased in the last year, even with the big expansion in parameter counts.
Babysitting LLM output will be the big thing for the next two years.
> How is that different from how it worked without LLMs? The only difference is that we can now get a failing product faster and iterate.
The difference is that there is an engineer in the middle who can judge if the important information is provided or not as input.
1. For an LLM, "the button must be blue" has the same level of importance as "the formula to calculate X is ...".
2. Failing faster and iterating is a good thing if the parameters of failure are clear, which is not always the case with vibecoding, especially when done by people with no prior development experience. Plenty of POCs built by vibecoding have been presented with no apparent failures on the happy path, but with disastrous results in edge cases, disastrous security, etc.
3. Where previously familiarity with the codebase, and especially with its history of changes, gave you context about why certain workarounds were put in place, these things are lost on an LLM. Vibecoding a change to an existing system risks removing those special workarounds, which encode much more than the current context of the specification or prompt.
> 1. For an LLM, "the button must be blue" has the same level of importance as "the formula to calculate X is ..."
You can divide those into two prompts, though; there is no point in having the LLM work on both features at the same time. This is why iterating is so useful ("oh, the button should be blue", and later, "the formula should be X").
> 2. Failing faster and iterating is a good thing if the parameters of failure are clear, which is not always the case with vibecoding, especially when done by people with no prior development experience. Plenty of POCs built by vibecoding have been presented with no apparent failures on the happy path, but with disastrous results in edge cases, disastrous security, etc.
This isn't about vibecoding. If you are vibecoding, then you aren't developing software; you are just wishing for good code from vague descriptions that you don't plan to iterate on.
> 3. Where previously familiarity with the codebase, and especially with its history of changes, gave you context about why certain workarounds were put in place, these things are lost on an LLM. Vibecoding a change to an existing system risks removing those special workarounds, which encode much more than the current context of the specification or prompt.
LLMs can read and write change logs just as well as humans can (LLMs need change logs to do updates; you can't just give one a changed dependency and expect it to pick up on the change, it isn't a code generator). Actually, this is my current project, since a dev AI pipeline needs to read and write change logs to be effective (when something changes, you can't just transmit the changed artifact, you need to transmit a summary of the change as well). And again, this is serious software engineering, not vibecoding. If you are vibecoding, I have no advice to give you.
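A minimal sketch of that idea (the names here are hypothetical, not the actual pipeline): when an artifact changes, a summary of the change travels with it, so a downstream LLM step sees the intent and not just the new bytes.

```python
from dataclasses import dataclass

@dataclass
class ArtifactUpdate:
    """What a downstream LLM step receives when a dependency changes."""
    path: str            # which artifact changed
    new_content: str     # the changed artifact itself
    change_summary: str  # what changed and why, in prose

def build_prompt_context(update: ArtifactUpdate) -> str:
    # Summary first: the model needs the intent of the change,
    # not just the new content, to update dependents correctly.
    return (
        f"Change log for {update.path}:\n{update.change_summary}\n\n"
        f"Updated content:\n{update.new_content}"
    )

ctx = build_prompt_context(ArtifactUpdate(
    path="api/client.py",
    new_content="def fetch(url, timeout=30): ...",
    change_summary="Added a timeout parameter; callers must handle TimeoutError.",
))
print(ctx.splitlines()[0])  # Change log for api/client.py
```

The point is the shape of the message, not the specific fields: the changed artifact alone is not enough input for an update.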
> LLMs can read and write change logs just as well as humans can (LLMs need change logs to do updates; you can't just give one a changed dependency and expect it to pick up on the change, it isn't a code generator). Actually, this is my current project, since a dev AI pipeline needs to read and write change logs to be effective (when something changes, you can't just transmit the changed artifact, you need to transmit a summary of the change as well). And again, this is serious software engineering, not vibecoding.
This is the important part of the post to which you replied and remains unaddressed:
The difference is that there is an engineer in the middle
who can judge if the important information is provided or
not as input.
The engineer decides what information to use as input to the update prompt. They don't need to be in the middle of anything; it's basically the level they are coding at.
> The engineer decides what information to use as input to the update prompt. They don't need to be in the middle of anything; it's basically the level they are coding at.
LLMs do not possess the ability to "judge if the important information is provided or not as input" as it pertains to the question originally posed:
How is that different from how it worked without LLMs?
Working without LLMs involves people communicating, hence the existence of "an engineer in the middle", where middle is defined as between stakeholder requirement definition and asset creation.
So you engineer the prompt. I'm still confused about what the problem is; I've already stated that I'm not talking about vibe coding, where the LLM somehow magically figures out the relevant information on its own.
> So you engineer the prompt. I'm still confused about what the problem is ...
The problem is that stakeholders are people, and they define which problems need to be solved. Those tasked with solving them need an understanding of the given problems. Tooling (such as LLMs) does not possess this type of understanding, as it is intrinsic to the stakeholders (people) who defined it. Tools can contribute to delivering a solution, sure, but they have no capability to do so autonomously.
For example, consider commercial dish washing machines many restaurants use.
They sanitize faster and more thoroughly than manual dish washing ever did. Still, there is no dish washing machine that understands why it must be used. Of course, restaurant stakeholders such as health inspectors and proprietors understand why.
As far as the commercial dish washer is concerned, it could just as easily be tasked with cleaning dining utensils as it could recycled car parts.
Rerouted? It was a completely different destination, much farther than his original one. There is nothing "no one can do" about stupid bureaucracies like "we can't stop at this station because we are not registered". First, they had time to register the stop when they changed the itinerary. Second, if they somehow failed at that, most probably because "there was no manual for how to do it", then in a situation like this, stupid rules like that should go out the window and the passengers should be let off as soon as possible, not 60 km away. Somehow they can be flexible with people's time but not with their stupid checklists.
The key explanation for failing to stop at the station is that the train was on the wrong track.
> "Apparently we were not registered at Troisdorf station, so we are on the wrong tracks"
Many stations have a 4 track system: a left track and right track which are adjacent to platforms, and 2 tracks in the middle, which are designed for non-stopping trains.
If the train was on the middle track, stopping would introduce risk and disruption by slowing/stopping the other trains travelling on the high-speed non-stopping line, and also endanger passengers who would have to dismount at height from the train onto an active track, cross the active track, and climb up to the platform.
Once the train was routed onto the incorrect track, correcting it was likely impractical (track transfer points are infrequent), and stopping on the high-speed track would have been excessively disruptive and dangerous.
It was extremely easy for that train to stop in Bonn-Beuel, which is anyway far superior to Troisdorf for a train that was originally scheduled to stop at Bonn Central Station. Failing to stop there shows perfectly how little DB cares about its passengers.
Lame excuse. There has to be a better alternative than taking them an hour in the wrong direction. Stop and shunt to the right track. Stop at one of the other 15 stations they skipped. But the best would be to simply avoid this kind of unnecessary error in the first place.
This is so typical of German bureaucracy. I used to work at a German university, which included teaching. I once had a small group of students who collectively plagiarized their coursework. Some of the group admitted doing so. I went to the examination office asking them to enforce the punishment for plagiarism (which would become increasingly severe with the number of offenses).
They simply told me: this behavior ought to be punished. Which is a euphemism for: but I'm not going to do it. They didn't want the hassle of potentially having one out of many students file a complaint or, worst case, go to Karlsruhe (Germans know what that means). This is exemplary of German bureaucracy: nobody wants to make decisions and carry responsibility.
I love Germany, but this is really something they need to fix going forward, because it stifles society and the economy in many ways.
Videos produce benefits (arguably far fewer now, with the AI-generated spam) that are difficult to reproduce in other, less energy-hungry ways. Compare that with this message, which would have cost a human nothing to type. Instead it went through AI inference, not only wasting energy on something that could have been accomplished much more easily, but also removing the essence of the activity: no one was actually thankful for that thank-you message.
A bit tired of the auto industry's "just in time" supply management. They had the same problem when COVID shut everything down, and now, five years later, they still have not learned that they can't just order "enough for 1-2 months of production" and no more. It's not as if the parts change in two months.
More like "it's bad to fry the planet, so we will destroy our economy for a 0.001% impact while the real impacters continue to advance and leave us in the dust".