Hacker News | chunky1994's comments

Does anyone use LLMs in such a manner that they believe they always have the most up-to-date information (without web search tools)?

Isn't this whole thesis negated by the fact that tool-calling web search exists? This just feels like a whole lot of words to say: don't treat an LLM as an always up-to-date, infallible statistical predictor.


> Does anyone use LLMs in such a manner that they believe they always have the most up-to-date information (without web search tools)?

Probably just 95% of the users. You know, the non-techies.



The AI hype and overstatement of capabilities is at least as strong amongst the 'techies' as amongst the people they treat as more credulous than themselves.

Without a doubt, yes. I'd encourage you to just try a session on a free ChatGPT account, asking questions you think a parent or someone unfamiliar with the space would probably ask.

It will not only answer confidently and incorrectly, it will also fail to web search in obvious scenarios where it should.

The words here aren't meant as a warning for people in this type of community falling victim to this type of thing; it's more for the general public that doesn't grasp the tools they are using, the people who won't ever wander across this article.

This, I think, is a huge reason we really need to jump into LLM-basics classes or something similar as soon as possible. People that others consider "smart" will talk about how great ChatGPT or something is; then those others will go try it out, because the person they respect must be right. They'll hop on the free model, get an absurdly inferior product, and not grasp why. They'll ask something that requires a web search to augment the model's info, not get that web search, and assume the confidently incorrect agent is correct.

The thesis also isn't, I think, entirely about not having up-to-date info at query time; it's more scattered than that. Someone asks what product they should use to mash potatoes, and a tool is suggested. Everyone who asks then receives that same recommendation, and instead of having a range of different styles of mashing potatoes, we all drift closer towards one style, and the range of variance in how food is prepared slowly gets lost.


Gemini can be asked about current events. I was quite surprised it was able to give structured information about a live boxing event in real time.

Most agents/chats have access to web search. I’m not overly surprised that it can do it, but it is very nice when it actually works.

Most users probably don’t ask themselves the question and simply are unwittingly affected by how the model happens to be wired.
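The tool-calling loop these comments are referring to can be sketched minimally. The `llm()` and `web_search()` functions below are hypothetical stand-ins, not any real vendor's API; the point is only the dispatch shape: the model either answers or requests a tool, and the caller loops until an answer arrives.

```python
def web_search(query):
    # Stand-in for a real search backend; returns a fake result string.
    return f"results for: {query}"

def llm(messages):
    # Stand-in for a model call: if no search results are in the
    # conversation yet, request the tool; otherwise answer from them.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "web_search", "query": messages[-1]["content"]}
    return {"answer": "grounded answer based on " + messages[-1]["content"]}

def answer_with_search(question):
    messages = [{"role": "user", "content": question}]
    reply = llm(messages)
    while "tool" in reply:  # dispatch tool calls until the model answers
        result = web_search(reply["query"])
        messages.append({"role": "tool", "content": result})
        reply = llm(messages)
    return reply["answer"]
```

The failure mode described upthread is exactly when the first branch never fires: the model answers from stale weights instead of emitting the tool request.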

Why do you expect web search tool calls to continue to be useful in the presence of modern AI slop farms, AI-assisted SEO, and search engines largely turning themselves into AI-based question-answering engines?

(At present, Gemini's question-answering capability (which Google kind of makes its users use) seems extremely error-prone -- much worse than competing LLMs when asked the same question.)


I agree with you: this is a huge concern, and we are still in an age where most content on the internet isn't AI-generated yet. What about 10 years from now? We already have many instances of people writing posts on Reddit or uploading videos and blogs using AI-generated text. What happens when that is a significant percentage of content?

I recently saw a video discussing a researcher who published a fake scientific article about a fictitious disease, with bogus author names, and even a warning IN the article itself stating "This is not a real disease; this article is not real" (paraphrasing), but AI still ended up picking up the article and serving information from it as if it were a real disease.

It even got cited in papers (which were later retracted, of course), but the fact that those papers got published in the first place is a serious issue.


> I recently saw a video discussing a researcher who published a fake scientific article about a fictitious disease, with bogus author names, and even a warning IN the article itself stating "This is not a real disease; this article is not real" (paraphrasing), but AI still ended up picking up the article and serving information from it as if it were a real disease.

Isn’t a lot of pretraining done by chopping sources up into short-context-window-sized pieces and then shoving them into the SGD process? The AI-in-training could be entirely incapable of correlating the beginning with the end of the article in its development of its supposed knowledge base.
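A toy illustration of that chunking step, with a made-up window size and document, showing how a disclaimer at the top of a long article can land in a different training chunk than the body:

```python
# Toy pretraining-style chunking: split a token stream into fixed-size
# context windows. The "document" and window size are invented for
# illustration, not taken from any real pipeline.

def chunk(tokens, window):
    # Non-overlapping fixed-size pieces, as a simplest-case sketch.
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]

# A long "article" whose disclaimer appears only at the very start.
doc = ["THIS", "IS", "NOT", "A", "REAL", "DISEASE"] + ["symptom"] * 20
chunks = chunk(doc, 8)

# The disclaimer lives only in chunks[0]; every later chunk contains
# nothing but "disease content", so training on those pieces in
# isolation never pairs the claims with the warning.
```

Real pipelines are more sophisticated (overlap, document packing, longer windows), but the basic disconnect the parent describes survives whenever the warning and the claims end up in separate training examples.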


I don't know, I am not an AI researcher, but if it is done that way, it seems very short-sighted (given the things AI is advertised to be able to do).

@dang, this has way more discussion than the previous thread, but people can't see this because it's a dupe.


Yes, I'm not sure what happened. Restored now.

p.s. @dang doesn't work - I only saw this randomly. For reliable (if sometimes delayed!) message delivery use hn@ycombinator.com.


Their personal access token must’ve been pwned too; not sure through what mechanism, though.


They have written about it on GitHub in response to my question:

Trivy hacked (https://www.aquasec.com/blog/trivy-supply-chain-attack-what-...) -> all CircleCI credentials leaked -> included PyPI publish token + GitHub PAT -> | WE DISCOVER ISSUE | -> PyPI token deleted, GitHub PAT deleted + account removed from org access, Trivy pinned to last known safe version (v0.69.3)

What we're doing now:

    Block all releases, until we have completed our scans
    Working with Google's Mandiant security team to understand scope of impact
    Reviewing / rotating any leaked credentials
https://github.com/BerriAI/litellm/issues/24518#issuecomment...


69.3 isn't safe. The safe thing to do is remove all Trivy access, or, failing that, pin the version. 0.35 is the last and AFAIK only safe version.

https://socket.dev/blog/trivy-under-attack-again-github-acti...


I have sent your message to the developer on GitHub and they have changed the version to 0.35.0, so thanks.

https://github.com/BerriAI/litellm/issues/24518#issuecomment...
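For reference, the pinning discussed in this exchange would look something like the following in a GitHub Actions workflow. This is a sketch only: the action name and tag come from the thread and are not independently verified here, and pinning to a full commit SHA is stronger still, since tags can be moved after the fact.

```yaml
# Sketch: pin the scanner to an exact version instead of a
# floating ref like @master or @v0.
- name: Run Trivy scan
  uses: aquasecurity/trivy-action@0.35.0   # ideally a full commit SHA
```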


Does that explain how CircleCI was publishing commits and closing issues?


Realistically, I think it will come down to the aggrieved counterparties here. Who was on the losing side of the money: was it Joe Schmoe day trader, or a bunch of funds who lost their shirts?

If it’s the hedge funds or institutional money, you can absolutely be sure this will come to a head. People don’t like being taken for a ride, and if they are repeatedly taken for a ride and they are organized market participants, they will come around and make sure there is a comeuppance as a collective.


There is credible reporting (Reuters etc.) that ships are being turned around, so it does appear that the mines (or at least threat thereof) have been deployed. Either way, as long as the threat of sinking is alive the strait is uninsurable and is for all practical purposes closed.


> ships are being turned around, so it does appear that the mines (or at least threat thereof) have been deployed

I'd assume, until further evidence, it's because the Strait is an active war zone.


Fair point, but the IRGC telling ships to turn around, as opposed to the ships themselves deciding to (as per reporting), would imply that the Strait has been blockaded in some fashion. It remains to be seen if this is all a bluff; I'm skeptical too, as this would be their last option, but given the strikes on other Gulf countries, the threat seems a bit more plausibly real.


That regime has absolutely nothing to lose at this point and they will use whatever they've got.


> The dark side of this same coin is when teams try to rely on the AI to write the real code, too, and then blame the AI when something goes wrong. You have to draw a very clear line between AI-driven prototyping and developer-driven code that developers must own. I think this article misses the mark on that by framing everything as a decision to DIY or delegate to AI. The real AI-assisted successes I see have developers driving with AI as an assistant on the side, not the other way around. I could see how an MBA class could come to believe that AI is going to do the jobs instead of developers, though, as it's easy to look at these rapid LLM prototypes and think that production ready code is just a few prompts away.

This is what's missing in most teams. There's a bright line between throwaway, almost fully vibe-coded, cursorily architected features on a product, and designing and building a scalable, production-ready product. I don't need a mental model to build a prototype; I absolutely need one for something I'm putting in production that is expected to scale, and where failures are acceptable but failure modes need to be known.

Almost everyone misses this when going whole hog on AI, or whole hog on no-AI.

Once I build a good mental model of how my service should work and design it properly, all the scaffolding is much easier to outsource. That's a speed-up, but I still own the code, because I know what everything does and my changes to the product are well thought out. For throwaway prototypes it's 5x the output, because the hard part of actually thinking the problem through doesn't really matter; it's just about getting everyone to agree on one direction of output.


But the world is not deterministic, inherently so. We know it's probabilistic, at least at small enough scales. Most hidden-variable theories have been disproven, and to the best of our current understanding the laws of the physical universe are probabilistic in nature (i.e. the Standard Model). So while we can probably come up with a very good probabilistic model of what can happen, there is no perfect prediction; or rather, there cannot be one.


Dummit and Foote is the classic abstract algebra textbook for learning how to precisely define these. Its treatment of ring theory is very well motivated and easy to grasp.


I don't think anyone is advocating for incentivizing forced/child labour.

Given that the ILAB list you linked is itself maintained under EO 13126, signed by the Clinton administration, I think there can be nuance in the discussion around whether the blanket application of certain foreign policy instruments is the right way to induce a change in the domestic policy of another country to solve the problem of bad labour practices.

We can do this without it becoming an argument about whether trade is "good" or "bad" depending on what "side" you are on.


This is a good discussion around the supply chain issues that will likely be happening: https://youtu.be/-dgHWv-Dh6Q?t=1370

Ryan runs Flexport, which is a supply-chain company, so it's from the "source", if you will.

