Hacker News | D-Machine's favorites

I took a long break from document processing after working on it heavily 20 years ago. The tools I used back then were ABBYY FineReader and PrimeOCR. I haven't tried any of the commercial cloud-based solutions. I'm currently using GLM-OCR, Chandra OCR, and Apple's LiveText in conjunction with each other (plus custom code for glue functionality and downstream processing).

Try just GLM-OCR if you want to get started quickly. It has good layout recognition quality, good text recognition quality, and they actually tested it on Apple Silicon laptops. It works easily out-of-the-box without the yak shaving I encountered with some other models. Chandra is even more accurate on text but its layout bounding boxes are worse and it runs very slowly unless you can set up batched inference with vLLM on CUDA. (I tried to get batching to run with vllm-mlx so it could work entirely on macOS, but a day spent shaving the yak with Claude Opus's help went nowhere.)

If you just want to transcribe documents, you can also try end-to-end models like olmOCR 2. I need pipeline models that expose inner details of document layout because I need to segment and restructure page contents for further processing. The end-to-end models just "magically" turn page scans into complete Markdown or HTML documents, which is more convenient for some uses but not mine.
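To illustrate why exposed layout matters, here is a minimal sketch of the kind of downstream restructuring that is only possible when a model returns per-region boxes. The `Region` type, its labels, and the two-column `column_split` heuristic are all hypothetical, made up for illustration; they are not the output format of GLM-OCR, Chandra, or any other model mentioned here.

```python
from dataclasses import dataclass

@dataclass
class Region:
    # Hypothetical layout record; real models use their own schemas.
    label: str                                # e.g. "text", "header", "footer"
    bbox: tuple[float, float, float, float]   # (x0, y0, x1, y1), origin at top-left
    text: str

def reading_order(regions, column_split):
    """Left column first, then right column, top to bottom within each."""
    def key(region):
        x_center = (region.bbox[0] + region.bbox[2]) / 2
        return (x_center >= column_split, region.bbox[1])
    return sorted(regions, key=key)

def body_text(regions, column_split, skip=frozenset({"header", "footer"})):
    """Join the page body in reading order, dropping running headers and footers."""
    ordered = reading_order(regions, column_split)
    return "\n\n".join(r.text for r in ordered if r.label not in skip)

if __name__ == "__main__":
    page = [
        Region("text", (320, 100, 580, 300), "right top"),
        Region("header", (40, 10, 580, 40), "Page 3"),
        Region("text", (40, 100, 300, 300), "left top"),
        Region("text", (40, 320, 300, 500), "left bottom"),
    ]
    print(body_text(page, column_split=310))
```

An end-to-end model makes these ordering and filtering decisions internally, so there is nothing to hook into if its choices don't suit your documents.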


There are diminishing returns to further optimization of lower-climate-impact meat sources. Look at greenhouse gas emissions per 100 grams of protein in various foods:

https://ourworldindata.org/grapher/ghg-per-protein-poore

Beef is really high at 48.89 kg CO2e, but pork is only 7.6 kg CO2e. Farmed fish is 5.98 and poultry is 5.7. If you can get people to switch from high-climate-impact meat to low-climate-impact meat, you've already reaped most of the possible climate gains from dietary change. To meet a given protein consumption target, you cut 88% of the emissions by getting protein from chicken instead of beef. Trying to get people to eat unfamiliar and potentially "icky" protein sources after they've already switched to chicken can only produce minor gains.
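As a rough check on the 88% figure, the arithmetic can be reproduced from the numbers quoted above. The values are copied from this comment, not re-fetched from Our World in Data, and the function name is just for illustration.

```python
# kg CO2e per 100 g of protein, as quoted in the comment above
EMISSIONS = {"beef": 48.89, "pork": 7.6, "farmed fish": 5.98, "poultry": 5.7}

def reduction_vs(baseline, replacement, table=EMISSIONS):
    """Percent of emissions avoided by swapping baseline for replacement at equal protein."""
    return 100 * (1 - table[replacement] / table[baseline])

if __name__ == "__main__":
    for food in ("pork", "farmed fish", "poultry"):
        print(f"beef -> {food}: {reduction_vs('beef', food):.0f}% lower")
```

By the same numbers, anything that replaces chicken can recover at most the remaining 12% or so of the beef baseline.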

Though most people are reacting to the headline about how humans could eat maggots, the article says that these maggots are actually being fed to chickens, farmed fish, and other animals. That approach reduces waste streams, slightly reduces the already-modest climate impact of farmed fish and poultry, and doesn't have the enormous uphill battle toward regulatory and consumer acceptance that direct human consumption would face.


https://www.anthropic.com/research/how-ai-is-transforming-wo...

See "How much work can be fully delegated to Claude?": "Although engineers use Claude frequently, more than half said they can “fully delegate” only between 0-20% of their work to Claude."

There won't be anything like what you're asking for. Even the vendors themselves, who will be the most positive and most enthusiastic about using it, can't do this with their own tools.


> I'm trying to determine what programming tasks are not in this list. :) I think it is trying to exclude adding new features and fixing bugs in existing code

Yes indeed. On the other hand, these are the things that aren't working well, in my opinion:

- large codebase

- complex domain knowledge

- creating any feature where you need product insights

- tasks requiring choices (again, complexity doesn't matter here, the task may be simple but require some choices)

- anything unclear, where you don't know up front where you are going

While you don't experience any of these when teaching or working on side projects, they are very common in any enterprise context.

