
Tesseract

Tesseract.js is a JavaScript library that extracts words in almost any language from images.

Cost / License

  • Free
  • Open Source

Platforms

  • Mac
  • Windows
  • Linux
No reviews
110 likes
2 comments
0 news articles

Features

  1.  OCR

Tesseract information

  • Developed by

    Unknown
  • Licensing

    Open Source and Free product.
  • Alternatives

    24 alternatives listed
  • Supported Languages

    • English

Our users have written 2 comments and reviews about Tesseract, and it has received 110 likes.

Tesseract was added to AlternativeTo by Akasam.

Comments and Reviews

   
Top Positive Comment
tylerszabo
5

In terms of OCR, Tesseract is fantastic. I compared it to ABBYY 14, and Tesseract had fewer errors on dictionary words. While it doesn't offer layout preservation with the OCR (i.e. converting into an editable document that should print similarly), you'll likely make up for that in the reduced time needed to fix OCR errors.

To handle PDFs, you'll first need to convert them to image files, for example with pdftopng (an open source tool from the Xpdf project).
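The conversion step mentioned in this comment could be scripted roughly as follows. This is a minimal sketch, assuming the `pdftopng` binary from Xpdf is installed and on the PATH. The file names `scan.pdf` and `page` are hypothetical, and the 300 DPI setting is an illustrative choice rather than anything the comment recommends.

```python
import subprocess
from pathlib import Path


def pdftopng_command(pdf_path: str, png_root: str, dpi: int = 300) -> list[str]:
    """Build the argument list for Xpdf's pdftopng.

    pdftopng writes one PNG per page, named <png_root>-<page number>.png.
    The 300 DPI default is an assumption made for this sketch, not the
    tool's own default.
    """
    return ["pdftopng", "-r", str(dpi), pdf_path, png_root]


def convert_pdf(pdf_path: str, png_root: str, dpi: int = 300) -> list[Path]:
    """Run pdftopng and return the page images it produced, sorted by name."""
    subprocess.run(pdftopng_command(pdf_path, png_root, dpi), check=True)
    root = Path(png_root)
    return sorted(root.parent.glob(f"{root.name}-*.png"))


if __name__ == "__main__":
    # "scan.pdf" and "page" are hypothetical names used only for illustration.
    for image in convert_pdf("scan.pdf", "page"):
        print(image)
```

Each resulting PNG can then be passed to the OCR engine one page at a time.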

TBayAreaPat
-3

Requires Java to be installed. On Windows, this also appears to be command-line only now, with no console as shown. Links in the new version's program folder are old and redirect. The Readme didn't work. Confusing. Try this before you downvote.

Featured in Lists

A list with 809 apps by AmileyaRyver, without a description.

What is an Adobe Creative Cloud FOSS alternative (including discontinued apps and Linux)? Well, there is not a full suite …

List by moonstone with 17 apps.

What is Tesseract?

The Tesseract OCR engine was one of the top three engines in the 1995 UNLV Accuracy Test. It saw little development between 1995 and 2006, but it is probably one of the most accurate open source OCR engines available. It reads a binary, grayscale, or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images, and libtiff can be added to read compressed ones. Language files are available for many languages, including for text set in Fraktur and blackletter typefaces.
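To illustrate the image-in, text-out flow, here is a minimal sketch that calls the `tesseract` command-line program from Python. It assumes a reasonably recent `tesseract` build and the relevant language data are installed. The file name `page.png` is hypothetical, and `eng` is used only as an example language code.

```python
import subprocess


def tesseract_command(image_path: str, lang: str = "eng") -> list[str]:
    """Build a tesseract invocation that prints recognized text to stdout.

    Passing "stdout" as the output base asks tesseract to write the text to
    standard output instead of creating a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path: str, lang: str = "eng") -> str:
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        tesseract_command(image_path, lang),
        check=True,           # raise if tesseract exits with an error
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output as str
    )
    return result.stdout


if __name__ == "__main__":
    # "page.png" is a hypothetical input file.
    print(image_to_text("page.png"))
```

Swapping the `lang` argument selects a different language file, provided that file is installed.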