I respect the feelings behind the post and I agree with a large part of it, but I’m inclined to disagree on a few points. The core problem is that outsiders without taste are showing up in a space where the current occupants have a long history of dues paid. But how is taste developed? It’s not innate; unfortunately, it’s a product of the long, ugly process you are currently witnessing. Think back to the first program you were proud of and judge it with today’s eyes.
"The test will be for logged-in adult users on the Free and Go subscription tiers. Plus, Pro, Business, Enterprise, and Education tiers will not have ads."
I really don’t feel comfortable defending OpenAI in any way because there is a lot to complain about, but ads on paid services that cost a lot more than $8 per month are not really an anomaly. Look at Amazon Prime, Prime Video, Hulu, airline flights, any major newspaper subscription, YT Premium, etc. I get the annoyance, but just cancel the service if you don’t want ads, or pay for a tier that isn’t subsidized.
It might be wrong but that’s not really a hallucination.
Edit: to give you the benefit of the doubt, it probably depends on whether the answer was a definitive “this does not exist” or “I couldn’t find it and it may not exist”.
claude said "I want to be straight with you: after extensive searching, I don't think the exact thing you're describing — a single paper that is obviously garbled/badly translated nonsense with no actual content, yet has accumulated hundreds or thousands of citations — exists as a famous, easily linkable example."
I’ve installed and tested Clawdbot twice and uninstalled it both times. I see no reason to use it unless it’s with local models. I can do everything Clawdbot can do with Claude Code natively and with fewer tokens. I found Clawdbot to be rather token-inefficient even with a Claude Max subscription: 14k tokens just to initialize and another 1,000 per interaction round, even with short messages like “Hey”. Another concern is that there are no guarantees Anthropic won’t lock down OAuth usage with your Max account like they did with OpenCode.
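To put rough numbers on that overhead, here is a quick Python sketch using only the two figures quoted above (about 14k tokens to initialize, about 1,000 per round). It assumes both are fixed costs and ignores the tokens in the actual prompts and replies; the function and constant names are made up for illustration.

```python
# Back-of-the-envelope overhead estimate. Both constants are the figures
# quoted in the comment above, treated as fixed costs (an assumption).
INIT_TOKENS = 14_000       # one-time cost to initialize a session
PER_ROUND_TOKENS = 1_000   # overhead per interaction round


def overhead_tokens(rounds: int) -> int:
    """Total overhead tokens for one session with the given number of rounds."""
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return INIT_TOKENS + PER_ROUND_TOKENS * rounds


for rounds in (1, 10, 50):
    print(f"{rounds:>3} rounds: {overhead_tokens(rounds):,} overhead tokens")
```

Under those assumptions the fixed startup cost dominates short sessions, which is why even a one-word exchange feels expensive.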
I am trying so hard to understand wtf people are excited about. I have failed. Claude Code can already run overnight or while I'm out.
Clawdbot looks like a great way to set tokens on fire.
There are a lot of people with incentives to hype the AI industry (VCs, founders, CEOs, internet personalities that need clicks, people that sell courses, etc). Last week everyone was hyping Claude Cowork, this week it's Clawdbot. Don't get me wrong, I think there are a lot of cool things going on, but there is also a lot of hype (similar to the original internet bubble).
I fought with Tesseract for quite a while. It's good if high accuracy doesn't matter. For transcribing a book from clean, consistent, non-skewed scans it's fine, and an LLM might even be able to clean up the output. But for legal or accounting data from hand-scanned documents, the error rate made it untenable. Even clean scanned documents of the same category have all sorts of density and skew anomalies that get misinterpreted. You'll pull your hair out trying to account for edge cases and never get the results you need, even with numerous adjustments and retraining the model on its errors.
Flash 2.5 or 3 with thinking gave the best results.
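If you want to put a number on "error rate" when comparing OCR engines, one common measure is character error rate: edit distance between the OCR output and a hand-checked reference, divided by the reference length. Below is a minimal stdlib-only sketch. It is not tied to Tesseract or Gemini, and the sample strings are invented for illustration, not real output from either tool.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


# Invented example: two misread characters in an invoice total.
reference = "Total due: 1,480.00"
ocr_output = "Total due: 1,4B0.O0"
print(f"CER: {char_error_rate(reference, ocr_output):.3f}")
```

For accounting data, even a small character error rate can be unacceptable, since a single wrong digit changes the amount.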
Thanks. I was surprised that Tesseract recognized poorly scanned magazines, and with a Python library I was able to transcribe a two-column layout with almost no errors.
Tesseract is a cheap solution as it doesn’t touch any LLM.
For invoices, Gemini Flash is really good, for sure, and you receive “sorted” data as well. So definitely thumbs up. I use it for transcribing difficult magazine layouts.
I think that for legally sensitive use cases like this, since companies don’t like to share financial data with Google, it would be better to use a local model.
I put together a spec for this where the entire LLM agent landscape adheres to the "Everything is a file" constraint. It uses the FUSE filesystem in the way described. I also created a possible limitation document to describe some areas where I thought it might be overengineered or locking in technical debt.
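To give a feel for what "everything is a file" could mean for an agent, here is a toy in-memory sketch. It is not the spec and does not use FUSE; a real version would expose these paths through a mounted filesystem. The class name, the `/agents/demo/...` paths, and the uppercase "agent" are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List


class AgentFS:
    """Toy in-memory stand-in for a file-style agent interface (no FUSE)."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}
        self._on_write: Dict[str, Callable[[str], None]] = {}

    def on_write(self, path: str, handler: Callable[[str], None]) -> None:
        """Register a callback that runs whenever `path` is written."""
        self._on_write[path] = handler

    def write(self, path: str, data: str) -> None:
        self._files[path] = data
        handler = self._on_write.get(path)
        if handler is not None:
            handler(data)

    def read(self, path: str) -> str:
        return self._files[path]

    def listdir(self, prefix: str) -> List[str]:
        """Return all stored paths under `prefix`, sorted."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))


fs = AgentFS()
# Hypothetical "agent": writing a prompt file produces a response file.
fs.on_write("/agents/demo/prompt",
            lambda text: fs.write("/agents/demo/response", text.upper()))
fs.write("/agents/demo/prompt", "hello")
print(fs.read("/agents/demo/response"))
print(fs.listdir("/agents/demo"))
```

The appeal of the constraint is that any tool that can read and write files can drive the agent; the cost is that everything has to be squeezed into read/write semantics, which is one place the overengineering concern could show up.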