***rich2005*** · 12-14-2019, 10:26 AM

The problem with basic Tesseract is that it is command-line only. That is obviously the best way if you are OCR-ing a whole book. One problem is loss of formatting: you tend to get long lines of text with no breaks, no headings, etc.
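For the whole-book case, a minimal sketch of driving the command line from a script might look like this. It assumes the pages are already saved as one image per page and that `tesseract` is on the PATH; the function names and the `*.png` pattern are made up for illustration. It relies only on the basic `tesseract <image> <outputbase> -l <lang>` form.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_cmd(image, lang="eng"):
    """Build the command: tesseract <image> <outputbase> -l <lang>."""
    image = Path(image)
    # Tesseract adds ".txt" to the output base itself, so strip the extension.
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def ocr_folder(folder, pattern="*.png", lang="eng"):
    """Run Tesseract on every matching page image, in sorted filename order."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    for image in sorted(Path(folder).glob(pattern)):
        subprocess.run(tesseract_cmd(image, lang), check=True)
```

Calling `ocr_folder("scans")` (a hypothetical folder name) would leave a `.txt` file next to each page image. The loss of formatting mentioned above still applies to each of those files.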

I use it in Linux for small screen-captured text images through a GUI (I prefer YAGF, but it is not working in 'buntu 18.04, so I use gImageReader).

A screen capture always needs some pre-processing in Gimp: scaling up 200% - 300%, cleaning the background, etc.
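To show what those two steps do to the pixels, here is a rough pure-Python sketch. It is not what Gimp does internally (Gimp interpolates when scaling; this just repeats pixels), and it assumes a grayscale image held as rows of 0-255 integers. The function names and the threshold of 128 are arbitrary choices for illustration.

```python
def scale_up(pixels, factor=3):
    """Nearest-neighbour upscale: repeat every pixel `factor` times each way."""
    out = []
    for row in pixels:
        wide = [value for value in row for _ in range(factor)]
        # Emit `factor` independent copies of the widened row.
        out.extend(list(wide) for _ in range(factor))
    return out


def clean_background(pixels, threshold=128):
    """Force every pixel to pure black (0) or pure white (255)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]
```

For example, `clean_background(scale_up(image, 3))` gives a 300% image with a flat white background and black text, provided the text is darker than the threshold.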

There is a Tesseract for Windows with a GUI here: https://ocr.space/blog/p/free-ocr-windows.html

And a quick try-out in a Win10 VM: https://i.imgur.com/H7fvKCu.jpg. That is typical; some post-OCR corrections are needed. Still better than typing out the whole thing ;)