Hacker News

Thanks for this. I tried using Tesseract over the weekend to extract text from a game screenshot and had no luck. The documentation for Tesseract is rather opaque; maybe I'll have better luck with Ocropus.


I wouldn't say that Ocropus is well-documented (this blog post was partially intended to address that). But it's at least written in easily hackable Python, whereas Tesseract is 30-year-old C/C++.


My main gripe with Tesseract is how convoluted and poorly documented the training procedure is, even though training is critical to getting better results. I'll be sure to check out Ocropus.


You'll enjoy my follow-up post then, which talks about training: http://www.danvk.org/2015/01/11/training-an-ocropus-ocr-mode...



