Tesseract and Tess4J

Tesseract is an OCR library, best known for being maintained by Google teams. Thankfully, there’s a Java wrapper, Tess4J, that makes it possible to combine this powerful functionality with Selenium or anything else that needs such technology. I already knew about Sikuli, and I’m stunned by such great open source libraries.

The italicized text on the picture comes from another Selenium test, which used Asprise OCR. That other OCR library gives messy results like:

Never M2suse the O, ne
Who Likes You
Never Say Busy To Th,e One
Who Needs You
Never cheat The One
Who ReaZZy Trust You,
Never foJnget The One
Who Zways Remember You.

Here Tess4J gives a perfect result:

Never Misuse the One
Who Likes You,
Never Say Busy To The One
Who Needs You,
Never cheat The One
Who Really Trust You,
Never Forget The One
Who Always Remember You.
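
To put a number on the difference between the two outputs, one option is to count character edits per line. The sketch below is only an illustration and not part of either library: the class name `OcrDiff` and the method `editDistance` are hypothetical, and it assumes the Tess4J output above can serve as the reference transcription. It uses the standard Levenshtein distance (insertions, deletions and substitutions each cost 1).

```java
public class OcrDiff {

    // Levenshtein distance computed with two rolling rows of the DP table.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning an empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a to reach an empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Lines copied from the two OCR outputs quoted in this post.
        String[] asprise = {
            "Never M2suse the O, ne", "Who Likes You",
            "Never Say Busy To Th,e One", "Who Needs You",
            "Never cheat The One", "Who ReaZZy Trust You,",
            "Never foJnget The One", "Who Zways Remember You."
        };
        // Assumption: the Tess4J output is treated as the correct text.
        String[] tess4j = {
            "Never Misuse the One", "Who Likes You,",
            "Never Say Busy To The One", "Who Needs You,",
            "Never cheat The One", "Who Really Trust You,",
            "Never Forget The One", "Who Always Remember You."
        };
        int total = 0;
        for (int i = 0; i < asprise.length; i++) {
            int d = editDistance(asprise[i], tess4j[i]);
            total += d;
            System.out.println(d + " edit(s): " + asprise[i]);
        }
        System.out.println("Total character edits: " + total);
    }
}
```

Running `main` prints the number of edits per line and a total, which makes it easier to compare OCR libraries on the same picture than eyeballing the text.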

Creator of tool jSQL Injection, coding junkie

Posted in Java
3 comments on “Tesseract and Tess4J”
  1. SutoCom says:

    Reblogged this on Sutoprise Avenue, A SutoCom Source.
    — I enjoy exchanging links to farming sites, crap! — edited by ron190

  2. SD says:

    Tesseract instance = Tesseract.getInstance(); getInstance shows as deprecated and the code throws an exception:

    Exception in thread "main" java.lang.NoSuchFieldError: RESOURCE_PREFIX
    at net.sourceforge.tess4j.util.LoadLibs.<clinit>(Unknown Source)
    at net.sourceforge.tess4j.TessAPI.<clinit>(Unknown Source)
    at net.sourceforge.tess4j.Tesseract.init(Unknown Source)
    at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
    at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
    at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
    at TesseractExample.main(TesseractExample.java:21)
    any insights?

  3. Vince says:

    Yes, it’s good, but the picture has to be clear, otherwise the results are not good.
