
> The distinction between code, config and data is being erased.

This distinction never existed in LISP. Greenspun's tenth rule in action.


The year of the Linux phone in India is coming.


> The argument that computational complexity has something to do with this could have merit but the article certainly doesn’t give indication as to why.

OP says it is because the predicted next token may or may not be correct, but it always looks plausible, since plausibility is what the model calculates. That makes it dangerous, and it cannot be fixed because it is how the system works in essence.


I just want to point out a random anecdote.

Literally yesterday ChatGPT hallucinated an entire feature of a mod for a video game I am playing including making up a fake console command.

It just straight up doesn’t exist, it just seemed like a relatively plausible thing to exist.

This is still happening. It never stopped happening. I don’t even see a real slowdown in how often it happens.

It sometimes feels like the only thing saving LLMs is when they're forced to tap into a better system, like running a search engine query.


Another anecdote. I've got a personal benchmark that I try out on these systems every time there's a new release. It is an academic math question which could be understood by an undergraduate, and which seems easy enough to solve if I were just to hammer it out over a few weeks. My prompt includes a big list of mistakes it is likely to fall into and which it should avoid. The models haven't ever made any useful progress on this question. They usually spin their wheels for a while and then output one of the errors I said to avoid.

My hit/miss rate with using these models for academic questions is low, but non-trivial. I've definitely learned new math because of using them, but it's really just an indulgence because they make stuff up so frequently.


I get generally good results from prompts asking for something I know definitely exists or is definitely possible, like an ffmpeg command I know I've used in the past but can't remember. Recently I asked how to do something in ImageMagick which I'd not done before but felt like the kind of thing ImageMagick should be able to do. It made up a feature that doesn't exist.

Maybe I should have asked it to write a patch that implements that feature.


When asking questions I use ChatGPT only as a turbo search engine. Having it double-check its sources and citations has helped tremendously.


I find it incredibly useful for information retrieval from dense, archival-like text knowledge. I research cellular networks, and everything on Google/DDG is just fluffy SEO spam, but I find Gemini can reliably home in on the precise subsection out of tens of thousands of pages of dense standards to tell me what 5G should do in a given scenario.


There is no difference between "hallucination" and "soberness", it's just a database you can't trust.

The response to your query might not be what you needed, similar to interacting with an RDBMS and mistyping a table name and getting data from another table or misremembering which tables exist and getting an error. We would not call such faults "hallucinations", and shouldn't when the database is a pile of eldritch vectors either. If we persist in doing so we'll teach other people to develop dangerous and absurd expectations.


No it's absolutely not. One of these is a generative stochastic process that has no guarantee at all that it will produce correct data, and in fact you can make the OPPOSITE guarantee, you are guaranteed to sometimes get incorrect data. The other is a deterministic process of data access. I could perhaps only agree with you in the sense that such faults are not uniquely hallucinatory, all outputs from an LLM are.


I don't agree with these theoretical boundaries you provide. Any database can appear to lack in determinism, because data might get deleted, corrupted or mutated. Hardware and software involved might fail intermittently.

The illusion of determinism in RDBMS systems is just that, an illusion. The reason I used those examples of failures when interacting with such systems is that most experienced developers are familiar with those situations and can relate to them, while the probability of the reader having experienced a truer apparent indeterminism is lower.

LLMs can provide an illusion of determinism as well; some are quite capable of repeating themselves, e.g. through overfitting, intentional or otherwise.


This seems unnecessarily pedantic. We know how the system works, we just use "hallucination" colloquially when the system produces wrong output.


If the information it gives is wrong, but is grammatically correct, then the "AI" has fulfilled its purpose. So it isn't really "wrong output" because that is what the system was designed to do. The problem is when people use "AI" and expect it will produce truthful responses - it was never designed to do that.


You are preaching to the choir.

But the point is that everyone uses the phrase "hallucinations" and language is just how people use it. In this forum at least, I expect everyone to understand that it is simply the result of next token generation and not an edge case failure mode.


I would have assumed that, but given how many people on HN throw around claims that LLMs can think, reason, and understand, I think it does bear clearly defining some of the terms used.


Other people do not, hence the danger and the responsibility of not giving them the wrong impression of what they're dealing with.


Sorry, I'm failing to see the danger of this choice of language? People who aren't really technical don't care about these nuances. It's not going to sway their opinion one way or another.


It promotes the view that LLMs are minds.


Yep. All these do is “hallucinate”. It’s hard to work those out of the system because that’s the entire thing it does. Sometimes the hallucinations just happen to be useful.


"Eldritch vectors" is a perfect descriptor, thank you.


> It sometimes feels like the only thing saving LLMs are when they’re forced to tap into a better system like running a search engine query.

This is actually very profound. All free models are only reasonable if they scrape 100 web pages (according to their own output) before answering. Even then they usually have multiple errors in their output.


I like asking it about my great great grandparents (without mentioning they were my great great grandparents just saying their names, jobs, places of birth).

It hallucinates whole lives out of nothing but stereotypes.


[flagged]


Responding with "skill issue" in a discussion is itself a skill issue. Maybe invest in some conversational skills and learn to be constructive rather than parroting a useless meme.


First of all, there is no such thing as "prompt engineering". Engineering, by definition, is a matter of applying scientific principles to solve practical problems. There are no clear scientific principles here. Writing better prompts is more a matter of heuristics, intuition, and empiricism. And there's nothing wrong with that — it can generate a lot of business value — but don't presume to call it engineering.

Writing better prompts can reduce the frequency of hallucinations but frequent hallucinations still occur even with the latest frontier LLMs regardless of prompt quality.


So you are saying the acceptable customer experience for these systems is that we need to explicitly tell them to accept defeat when they can’t find any training content/web search results that matches my query enough?

Why don't they have any concept of having a percentage of confidence in their answer?

It isn’t 2022 anymore, this is supposed to be a mature product.

Why am I even using this thing rather than using the game’s own mod database search tool? Or the wiki documentation?

What value is this system adding for me if I’m supposed to be a prompt engineer?


> What value is this system adding....

https://news.ycombinator.com/item?id=44588383


is this supposed to be some kind of mic drop?


To take a different perspective on the same event.

The model expected a feature to exist because it fitted with the overall structure of the interface.

This in itself can be a valuable form of feedback. I currently don't know of anyone doing it, but testing interfaces by getting LLMs to use them could be an excellent resource. If the AI runs into trouble, it might be worth checking your designs for inconsistencies, redundancies, or other confusion-causing issues.

One would assume that a consistent user interface would be easier for both AI and humans. Fixing the issues would improve it for both.

That failure could be leveraged into an automated process that identifies areas to improve.



This is meant to be used in hospitals. Where I live, no hospital personnel use phones to manage healthcare data. They have PCs.


MyGNU Health looks to be along the lines of Apple Health and is intended to be used by consumers to monitor vitals and track statistics.


It makes sense to own your own medical data rather than handing it over to big tech/FAANG.


Also to manage it on a device that is not owned by big tech/FAANG

I’d never put my health data on an iPhone or Googlized phone.


You seem to be living in the past. While EHRs are still primarily used from desktop PCs, all of the major ones have native mobile apps now. Clinicians appreciate being able to review patient charts and action alerts while away from a PC cart.


And this would be a white-label Epic MyChart for the particular health system, with embedding for the inpatient or customer-facing connections that should be used.

It seems like that could be done with a system shipping their own white-labeled GNU Health app through the App Store


Better to make it a web app, so you don't have to mess with Apple or Google's broken economics.


You're really missing the point. The EHR vendors aren't charging customers for those apps through the Apple or Google app stores so "broken economics" are irrelevant. The app stores are only a distribution mechanism and work fine for that.


I am happy to live in a country that values data safety for critical patient data.


We united coders have the power to change that. We have done it in the past. We are going to do it again.


I have cut my hot water costs by 80% with balcony solar panels. I have a hot-water heat pump with 600 W of electrical power. My little server turns it on automatically when the solar excess power is greater than 540 W (measured by the smart meter). This usually generates enough hot water for our household. The solar panels also easily cover the idle power of the house of 50-100 W during daytime. This pays off in a few years, and it reduces my carbon footprint and that of my neighbors.
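The control logic described above is simple enough to sketch. This is a minimal illustration only; `read_excess_power()` and `set_pump()` are hypothetical stand-ins for whatever API the smart meter and smart plug actually expose, and the 540 W / 600 W figures come from the comment above.

```python
# Sketch of the described automation: run a 600 W heat pump only when the
# smart meter reports enough excess solar power. A small hysteresis band
# keeps the pump from flapping on and off around the threshold.

PUMP_POWER_W = 600    # electrical draw of the heat pump
ON_THRESHOLD_W = 540  # switch on when excess solar covers ~90% of the pump
HYSTERESIS_W = 100    # keep running until excess drops well below threshold

def decide_pump_state(excess_watts: float, pump_on: bool) -> bool:
    """Return whether the pump should run given current excess solar power."""
    if not pump_on:
        return excess_watts >= ON_THRESHOLD_W
    return excess_watts >= ON_THRESHOLD_W - HYSTERESIS_W

# The server's main loop would then be roughly:
#   while True:
#       pump_on = decide_pump_state(read_excess_power(), pump_on)
#       set_pump(pump_on)
#       time.sleep(60)
```

The hysteresis is a design choice: without it, a passing cloud that drops output from 545 W to 535 W would toggle the pump every polling cycle.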


The hybrid (heat-pump and heating-element) water heater I installed 2 years ago has already paid for itself in savings. This design literally pulls heat out of your conditioned space, providing both cooling and dehumidification (I live in a humid temperate rainforest, so win-win). Less than 3% of my annual electric usage goes to my water heater, versus the typical 10%+.

During the brief winter months I just set it to heating elements only, and it behaves like a traditional watertank heater (i.e. doesn't cool house in the winter, using only resistive heating).


How do you connect that?

I assume balcony solar panels provide you with a power socket. How do you connect all the appliances in house to that socket(s)? Isn't it a lot of cabling?


The solar panels are connected to an inverter, which is connected to the normal grid with a standard plug.


Yes, but where do you live?


Well, the person you are replying to is in a thread about Germany, mentions balcony solar and said "my little server turns it automatically on" (which is how you would construct that sentence in German instead of "turns it on automatically"), so my wild guess would be Germany. ;)


Germany isn't that big, but the difference between Freiburg and Hamburg is very significant in this case I believe


Germany has a pretty consistent climate. Doesn't really matter where you live. Of course, that's an oversimplification, but if you're new to Germany and wonder "oh, what's the weather going to be here?", the answer pretty much is "similar to the rest of the country".

You could then look at a map of France and think, ah, similarly sized country, probably also has a consistent climate, but that's not true. Southern France is very different from Northern France. But Germany's climate is pretty uniform.


I moved from Hamburg (north) to near Munich (south) and the difference is huge. I can see the blue sky, for example! So much better here.


Yes, there is a difference, you are right. I don't have hard numbers at the moment (typing from my phone), but from looking it up quickly, annual solar irradiation varies from about 950 kWh/m² to about 1,200 kWh/m² between north and south Germany. So what OP described will generally work in any part of Germany.


Point taken! Scanning comments rapidly to move on to actually doing some work today has its drawbacks.


Austria


The year of the Linux Phone is coming!


There sadly isn't a single viable option for a Linux mobile phone out there.

- Purism runs ancient hardware, charges way too much and has questionable business ethics.

- Pine64 has equally bad hardware but reasonable prices. I don't like the Hong Kong connection, though. Not sure how the security patching situation is in practice.

The only option on the table as I see it is buying from the devil and installing GrapheneOS.



The latest phone from https://wiki.postmarketos.org/wiki/Devices was released in 2021.


It won't be bleeding edge, but people buy it for the same reason they buy laptops with a Core 2.


> Purism runs ancient hardware

https://puri.sm/posts/the-danger-of-focusing-on-specs/

> charges way too much

https://news.ycombinator.com/item?id=21656355

> questionable business ethics

They retroactively changed their return policy in order to avoid going bankrupt. AFAIK everything is fine now. I'm a happy owner of a Librem 5, btw.


There is also Jolla / Sailfish OS, built by ex-Nokia engineers. The Russians forked it and are using it in government/industry.


FuriLabs has shipped a usable device for going on two hardware releases now.

Yes, it currently builds on top of Halium. Anyone who thinks this should be a sticking point has their head in the sand; the device and the effort behind it are how you get a usable ecosystem rolling.


DHH has not completed his desktop Linux quest yet…


Maybe a sufficient number of hackers are offended enough now to contribute to really free platforms, like postmarketOS or Mobian. There has been great work there in recent years. I think we are not very far from a really usable free phone; we need device drivers and Android emulation / F-Droid until native apps catch up.


Based on my experiences with LLMs and the hype around it, we will need more experienced programmers. Because they will have to clean up the huge mess that will come.


In my experience if you look at what effect democratizing code actually has, this is exactly the case. People are generating code, but that’s always been the easy part. The mess this is going to make, gonna need a lot of mops.


This applies to information production in general.

More and more people are producers of information nowadays. But I think everyone can agree, the quality has declined substantially.

I personally find myself reverting back to textbooks more and more and trusting the information I find on the internet less.


If programmers had any self respect we would refuse to be slop janitors and instead just build competing tools and eat the lunches of AI "coders"

But time and time again programmers continue to demonstrate we have no self respect and we're happy to dance to the tune of capital for money. That means we're definitely going to be downgraded to "slop janitors" in the near future

