gImageReader and Tesseract

solanus
gImageReader and Tesseract

Program: gImageReader and Tesseract

License: Open Source

Description: gImageReader is a GUI front end for the Tesseract OCR engine

Website: http://sourceforge.net/projects/gimagereader/ and http://sourceforge.net/projects/tesseract-ocr/

gImageReader is an excellent front end for the Tesseract OCR engine.
Tesseract is an open source OCR engine that converts images into editable text. gImageReader is installed onto a system that already has Tesseract installed, which is why this App Request lists both of them.

gImageReader Features

- Open images and PDFs
- Acquire from scanner
- Select the part of the image to recognize
- Support for different recognition languages
- Side by side comparison of source image and output text
- Remove linebreaks in output text
- Supports Tesseract 3.0

One challenge: it also supports spellcheck, but it uses the dictionaries from OpenOffice. Could it possibly be configured to use the dictionaries in LibreOffice Portable?

bill_gagliardi
Me too...

I would be interested in this, as well... :)

Bill G.
Frozen St. Paul, MN
land of the frozen mosquito

gluxon
Developer
Nice find! I'll consider packaging it when school is finally over for me. :)

Soulmech
This would do wonders for me, especially since TopOCR Portable no longer exists.

SWAG

Voltron43
Test Installers

Okay, I've created test installers, but I haven't posted them in the forum yet.

https://sourceforge.net/projects/voltronportable/files/gImageReaderPorta...

https://sourceforge.net/projects/voltronportable/files/Tesseract-ocrPort...

Just add the X:\PortableApps\CommonFiles\Tesseract-ocr path to the gImageReader configuration.

Voltron43
Both have official postings

I've created two posts in the beta forum for these applications.

Tesseract-ocr Portable
gImageReader Portable

solanus
Thanks - but couldn't they just be in one package?

Thanks for taking the initiative on this, but I was wondering what the advantage is in having them as separate packages.
Tesseract is a command-line-only app, which is the reason gImageReader is necessary. Since they are interdependent (gImageReader won't work without Tesseract, and Tesseract is non-GUI without gImageReader), wouldn't it make sense to package them together rather than require users to download each separately?
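
For anyone wondering what "command-line only" means in practice, here is a minimal Python sketch of the kind of call a front end has to make on the user's behalf. It assumes the usual `tesseract <image> <outputbase> -l <lang>` invocation and an executable named `tesseract` on the PATH. The helper names `build_tesseract_command` and `run_ocr` are made up for illustration and are not taken from gImageReader.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng", exe="tesseract"):
    """Return the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return [str(exe), str(image), str(output_base), "-l", lang]


def run_ocr(image, output_base, lang="eng", exe="tesseract"):
    """Run Tesseract and return the recognized text (hypothetical helper)."""
    subprocess.run(build_tesseract_command(image, output_base, lang, exe), check=True)
    # Tesseract appends ".txt" to the output base name by default.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With Tesseract installed, `run_ocr("scan.png", "scan")` would write `scan.txt` next to the image and return its contents. A GUI mostly adds image selection and an editable result view on top of this.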

I made this half-pony, half-monkey monster to please you.

Voltron43
Tesseract-ocr is considered a plugin, in my opinion.

In my opinion, Tesseract-ocr acts as a plugin and should be installed in the CommonFiles directory. That way a single installed instance of Tesseract-ocr can be shared if another application needs to use it.

John T. Haller
Admin, Developer, Moderator, Translator
Overall Yes, But Not Here

Realistically, as gImageReader is the only thing using it and it needs it to be of any use, it should be bundled and be one package. CommonFiles is only for broad things used by many apps, think Java or GhostScript. Sometimes we bundle something that would really fit in plugins because it's the only app that needs it. Sometimes because it needs a specific version (like GTK). Here, though, this should be one package with both the main app and Tesseract. If we get to the point where we have a few apps using it, we can revisit it.

Sometimes, the impossible can become possible, if you're awesome!

Voltron43
Thanks, John

John, thanks for the clarification. I can bundle the two packages together and post them to the gImageReader post. What directory structure do you recommend using?

gImageReaderPortable\App\Tesseract-ocr
or
gImageReaderPortable\App\gimagereader\Tesseract-ocr?

Thanks.

John T. Haller
Your Choice

I'd leave it to you. I think gImageReaderPortable\App\Tesseract-ocr may be the better fit, since they are separate pieces. And if we do it as a separate plugin later, we can keep that directory and just not have it bundled (à la the App\Java directory in LibreOfficePortable).
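
To make the layout concrete, here is a rough Python sketch of how a launcher could look for Tesseract under that structure: the bundled copy in App\Tesseract-ocr first, then a shared copy in CommonFiles if it is ever split out as a plugin. The function name `tesseract_candidates` and the lookup order are assumptions for illustration, not the actual PortableApps.com Launcher logic.

```python
from pathlib import PureWindowsPath


def tesseract_candidates(app_root):
    """List the places to look for Tesseract, most specific first.

    app_root is the portable app's own folder,
    e.g. X:\\PortableApps\\gImageReaderPortable (hypothetical layout).
    """
    root = PureWindowsPath(app_root)
    bundled = root / "App" / "Tesseract-ocr"                   # shipped inside the app
    shared = root.parent / "CommonFiles" / "Tesseract-ocr"     # possible future plugin
    return [bundled, shared]
```

A real launcher would then pick the first candidate that exists on disk, so moving Tesseract out to CommonFiles later would not require changing the app's own directory structure.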

Sometimes, the impossible can become possible, if you're awesome!

Voltron43
Tesseract-ocr Portable Outdated & gImageReader Portable Updated

I've updated gImageReader Portable to include Tesseract-ocr, and Tesseract-ocr Portable has been marked as outdated.
