Application: gImageReader
Category: Office
Description: gImageReader is a graphical GTK frontend to tesseract-ocr.
Download gImageReader Portable 2.91 Development Test 1 [21.7MB download / 94.5MB installed]
(MD5: 7b8f5f9cd955506becd545148780f197)
Warning: Updating from 0.9 to 2.91 will erase all previously installed spelling dictionaries and tesseract language definitions.
Spelling and Language file Installation Directions:
- Install the spelling and language files after running gImageReaderPortable for the first time.
- Install spelling dictionaries by putting them into the Data\dicts folder.
- Install tesseract language definitions by putting them into the Data\tessdata folder.
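For anyone who would rather script these steps, here is a minimal Python sketch. It assumes that dictionaries are the usual .aff/.dic pairs and that language definitions are .traineddata files. The function name install_language_files and the extension mapping are illustrative only and are not part of the package.

```python
import shutil
from pathlib import Path

# Assumed mapping from file extension to the Data subfolder named in the directions.
DESTINATIONS = {
    ".aff": "dicts",
    ".dic": "dicts",
    ".traineddata": "tessdata",
}


def install_language_files(source_dir, data_dir):
    """Copy dictionary and tessdata files from source_dir into data_dir.

    Returns the list of destination paths that were written.
    """
    source_dir, data_dir = Path(source_dir), Path(data_dir)
    copied = []
    for item in sorted(source_dir.iterdir()):
        subfolder = DESTINATIONS.get(item.suffix.lower())
        if subfolder is None or not item.is_file():
            continue  # skip anything that is not a recognised language file
        target_dir = data_dir / subfolder
        target_dir.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(item, target_dir)))
    return copied
```

Point data_dir at the Data folder inside your gImageReaderPortable directory (the exact location depends on where you installed it), and only run it after the app has been started once.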
Release Notes:
2.91 Development Test 1 (2014-03-07):
- gImageReader has been updated to 2.91 (Change log)
- Tesseract-ocr is now integrated into gImageReader.
- Settings are now stored in the registry instead of a config file.
0.9 Development Test 2 (2012-07-08):
- Tesseract-ocr is now included in the package.
- Language has been changed to English.
- CustomCodePostInstall.nsh has been created to automatically fill in the Tesseract-ocr paths in gImageReaderPortable's settings if the install is new.
0.9 Development Test 1 (2012-07-03): Initial release
Thank you very much for posting this. It did a really good job of converting a PDF I've been meaning to convert for some while.
Thanks for this. I have wanted to get a product that would do OCR when I didn't have access to Adobe Acrobat at work. This fits the bill and worked very well on the files I threw at it.
One thing I did have a bit of a problem with was figuring out where to put the dictionary files after the initial install. I eventually found the folder ..\PortableApps\gImageReaderPortable\App\gimagereader\share\myspell\dicts and the Readme.txt therein, but it was a bit cumbersome to find.
Maybe adding a note to the PAC help file or a Readme in the base \gImageReaderPortable would be more user friendly, as the process is different from what gImageReader's Help file indicates.
Still, thanks for the program!
I'll definitely look into documenting this. Thanks for the suggestion.
Can someone confirm they've used this program with a language package other than the pre-installed English?
Every time I try to read an image with any language other than English, I get an error saying tesseract.exe has stopped working. I've tried all kinds of resolutions, etc. It just won't work. I downloaded the language packs from:
http://code.google.com/p/tesseract-ocr/downloads/list
Tried extracting the whole package, or just the .traineddata file (found some mixed tips online) to the gImageReaderPortable\App\Tesseract-ocr\tessdata\ folder.
I've tried multiple languages, none of them works. English works every time, even on non-English languages.
I'm running Win7 x64.
EDIT: I've found the problem - I was downloading the latest traineddata files, but since the current portable app is based on an older tesseract version, I needed to download the older traineddata files as well.
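In case it helps anyone else, here is a rough Python sketch of the check I ended up doing by hand: compare the version in the language pack's file name with the version the bundled Tesseract reports. The pack file names below are made-up examples, and the helper names major_minor and pack_matches_engine are my own, not anything Tesseract provides.

```python
import re


def major_minor(text):
    """Return the first 'major.minor' version found in text as a tuple, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def pack_matches_engine(pack_name, engine_notes):
    """True when the pack name and the engine notes carry the same major.minor version."""
    pack, engine = major_minor(pack_name), major_minor(engine_notes)
    return pack is not None and pack == engine


# Hypothetical file names, compared against a release-notes line of the bundled Tesseract.
print(pack_matches_engine("tesseract-ocr-3.01.deu.tar.gz", "Oct 21 2011 - V3.01"))  # True
print(pack_matches_engine("tesseract-ocr-3.02.deu.tar.gz", "Oct 21 2011 - V3.01"))  # False
```

A pack with no version in its name returns False here, so it still has to be checked manually.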
No activity for nearly a year; please update.
I'll update it soon.
I'm not going to update the portable version of gImageReader because version 0.9.1 has not been released with a Windows distribution.
Tesseract stops working as soon as I try to recognize a scanned text, even if only a selected paragraph in it.
Error message states:
"Failed to perform recognition.
"tesseract returned:
"This probably is a bug in tesseract, retry using an image at a different resolution or by varying the selected region."
I did this, to no avail.
If someone is working with this program, especially in languages other than English, I'd appreciate knowing what changes were made to make it work.
The only thing I did, is place the hunspell dictionaries for English in the proper folder.
Eolo
Which version of Tesseract are you using?
Hi.
I am using the version that is packaged with gImageReader Portable 0.9 Development Test 2, accessible at the top of this page. It states "Tesseract-ocr is now included in the package. CustomCodePostInstall.nsh has been created to automatically fill in the Tesseract-ocr paths in gImageReaderPortable's settings if the install is new."
In any case, the Tesseract release notes accompanying the package say "Oct 21 2011 - V3.01"
Do you think I should install a separate copy of Tesseract?
Eolo
I don't know which dictionary you're using exactly, but I was able to successfully run gImageReader with the US English one.
Here are the steps I took:
- Copied the en_US.aff & en_US.dic files to .\gImageReaderPortable\App\gimagereader\share\myspell\dicts
- Ran gImageReaderPortable.exe
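If you want to double-check the dictionary folder before launching, here is a small Python sketch. It relies on the fact that a Hunspell dictionary needs both an .aff and a .dic file with the same base name. The function name dictionary_status is just something I made up for illustration.

```python
from pathlib import Path


def dictionary_status(dicts_dir):
    """Map each dictionary base name to whether both its .aff and .dic files exist."""
    dicts_dir = Path(dicts_dir)
    stems = {p.stem for p in dicts_dir.iterdir() if p.suffix in (".aff", ".dic")}
    return {
        stem: (dicts_dir / f"{stem}.aff").is_file()
        and (dicts_dir / f"{stem}.dic").is_file()
        for stem in sorted(stems)
    }
```

Call it with the path to the dicts folder; any entry that maps to False is missing one half of its pair.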
Which dictionaries are you using specifically? If you point me to them, I can run a few tests myself. Have you tried running gImageReader on multiple files?
I installed the dictionary files you pointed me to, and they turned out to be the same ones I already had in that folder: en_US.aff (63 KB) and en_US.dic (812 KB). Nonetheless, I replaced them and started gImageReaderPortable again.
I had no complaints at this point, and the dictionaries were acknowledged by the program.
I scanned a two paragraph page in English at 200 dpi and, in a second attempt, at 300 dpi.
I selected English as the language for the recognition process (there is nothing else to choose from, though).
And started recognition.
After a couple of seconds the system announced that Tesseract had stopped running. Once this dialog is dismissed, a second dialog appears, stating that recognition failed and that it is due to a bug in Tesseract, and prompting me to try scanning at different resolutions (which I did, fruitlessly).
Eolo
Hmmm. Are you able to successfully run it on PDF files without the scanning portion? I looked in both Tesseract's and gImageReader's bug trackers and I didn't see anything related to scanning. I'd suggest trying a local install of the two programs and scanning from there. If you still get an error, I'd assume that it's with the programs themselves and not the portable version.
Thanks for your efforts, but could you please update the package to the newest release?
Working on it. Thank you for the notification that there is a new update.
2.91 Development Test 1 has been released.
Warning: Updating from 0.9 to 2.91 will erase all previously installed spelling dictionaries and tesseract language definitions.