Tesseract
Pure JavaScript OCR library.
- Free • Open Source
- Mac
- Windows
- Linux
...
Tesseract.js is a JavaScript library that extracts words in almost any language from images.
The Tesseract OCR engine was one of the top three engines in the 1995 UNLV Accuracy test. It saw little development between 1995 and 2006, but it is probably one of the most accurate open source OCR engines available. It reads a binary, grey or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images, and libtiff can be added to read compressed ones. Language files are available for many languages, and even for text set in Fraktur and blackletter typefaces.
Summary and Relevance
Our users have written 1 comment or review about Tesseract, and it has 80 likes.
- Open Source and Free product.
- 49 alternatives listed
Tesseract was added to AlternativeTo by Akasam on May 11, 2009 and this page was last updated Nov 18, 2020.
Comments and Reviews
Said about Tesseract as an alternative
samsuffit: GImageReader is a very good tool that in fact uses the 'Tesseract' JavaScript library.
Category
Office & Productivity
Tags
- text-recognition
In terms of OCR, Tesseract is fantastic. I compared it to ABBYY 14 and Tesseract had fewer errors on dictionary words. It doesn't offer layout preservation with the OCR (i.e. converting into an editable document that should print similarly), but you'll likely make up for that in the reduced time needed to fix OCR errors.
To handle PDFs, you'll first need to convert them to image files, for example with pdftopng (an open source tool that can be found in the Xpdf project).
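The PDF workflow described in the review can be sketched roughly as follows. This is only an illustration, assuming the `pdftopng` and `tesseract` command-line tools are installed and on the PATH. The helper names (`pdftopng_command`, `tesseract_command`, `ocr_pdf`), the 300 DPI resolution and the `page` file prefix are assumptions made for this example, not something the review specifies.

```python
import subprocess
from pathlib import Path
from typing import List


def pdftopng_command(pdf_path: str, png_root: str, dpi: int = 300) -> List[str]:
    """Build the call `pdftopng -r <dpi> <PDF-file> <PNG-root>`.

    The 300 DPI default is an assumption for this sketch.
    """
    return ["pdftopng", "-r", str(dpi), pdf_path, png_root]


def tesseract_command(image_path: str, output_base: str, lang: str = "eng") -> List[str]:
    """Build the call `tesseract <image> <outputbase> -l <lang>`."""
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_pdf(pdf_path: str, work_dir: str, lang: str = "eng") -> str:
    """Hypothetical helper: render each PDF page to PNG, then OCR each page."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    png_root = str(work / "page")

    # pdftopng writes one file per page, named <PNG-root>-NNNNNN.png.
    subprocess.run(pdftopng_command(pdf_path, png_root), check=True)

    texts = []
    for png in sorted(work.glob("page-*.png")):
        out_base = str(png.with_suffix(""))
        # tesseract writes its result to <outputbase>.txt.
        subprocess.run(tesseract_command(str(png), out_base, lang), check=True)
        texts.append(Path(out_base + ".txt").read_text(encoding="utf-8"))
    return "\n".join(texts)
```

Calling `ocr_pdf("scan.pdf", "ocr_work")` would then return the recognized text of all pages joined together. The file name `scan.pdf` and the directory `ocr_work` are placeholders.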