OCRfeeder

A Complete OCR Suite

OCRfeeder Ranking & Summary

  • License: GPL v3
  • Price: FREE
  • Publisher Name: Joaquim Rocha

OCRfeeder Description

OCRFeeder is a document layout analysis and optical character recognition system.

Given a set of images, it automatically outlines their contents, distinguishes graphics from text, and performs OCR on the text. It can generate several output formats, the main one being ODT.

OCRFeeder features a complete GTK graphical user interface that lets users correct any unrecognized characters, define or correct bounding boxes, set paragraph styles, clean the input images, import PDFs, save and load the project, export everything to multiple formats, etc.

Installation on Ubuntu:

The only packages that need to be installed on Ubuntu 8.10 are PyGoocanvas and Unpaper; the rest of the dependencies are already present in a fresh install of this version of Ubuntu. The Ocrad engine is installed as well.

To install PyGoocanvas, Ocrad and Unpaper, execute the following command as superuser:

    apt-get install python-pygoocanvas ocrad unpaper

Once all of the packages have finished installing, OCRFeeder is ready to be installed. To install it, run the setup.py script as superuser:

    setup.py install

OCRFeeder can now be run from a desktop menu or by running the ocrfeeder command. On the GNOME desktop, if the desktop menu entry does not show OCRFeeder's icon, update the icon cache with the following command (as superuser):

    gtk-update-icon-cache -f -t /usr/share/icons/hicolor

Command Line Usage:

This section explains how to use OCRFeeder from the command line.

The command line interface part of OCRFeeder is aimed at users who want to perform quick and unattended conversions of document images to editable formats. It also makes this project usable from other applications.

Two parameters are mandatory:

  1) the path to each document image to be processed, given after the parameter --images;
  2) the name of the document to be generated, given after the parameter --o.

For example:

    ocrfeeder-cli --images ~/image1.png ~/image2.jpeg --o converted_document

The pages of the generated document follow the order of the given paths.

It is also possible to specify the format of the document to be generated (HTML or ODT) with the option --format. If no format is specified, the images are exported to ODT. Continuing with the example above:

    ocrfeeder-cli --images ~/image1.png ~/image2.jpeg --format HTML --o converted_document

OCRFeeder Studio (the graphical user interface part) can also be launched from the command line. Two options can be used to load images right after the program starts: --images, which adds the images given as the option's arguments, and --dir, which adds all the images under a given directory path. The options can be used individually or combined, for example:

    ocrfeeder --images ~/image1.png ~/image2.jpeg --dir ~/Desktop

For any usage, the options and parameters can be given in any order.
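
Since the command line interface is meant for unattended conversions, the invocation described above could be wrapped in a small shell function. The following is only a minimal sketch: convert_pages and the DRY_RUN variable are hypothetical names introduced for illustration, and the format check assumes that ODT and HTML, spelled as in the text above, are the only accepted values. It uses only the --images, --format and --o options already described, and it defaults to a dry run so the command can be inspected before anything is executed.

```shell
#!/bin/sh
# Hypothetical wrapper around ocrfeeder-cli (not part of OCRFeeder itself).
# Usage: convert_pages OUTPUT_NAME FORMAT IMAGE...
convert_pages() {
    if [ "$#" -lt 3 ]; then
        echo "usage: convert_pages OUTPUT_NAME FORMAT IMAGE..." >&2
        return 1
    fi
    out=$1
    fmt=$2
    shift 2   # the remaining arguments are the image paths, in page order

    # Assumption: only the two formats named in the description are valid.
    case $fmt in
        ODT|HTML) ;;
        *) echo "convert_pages: unsupported format: $fmt" >&2; return 1 ;;
    esac

    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run (the default): print the command instead of running it.
        echo ocrfeeder-cli --images "$@" --format "$fmt" --o "$out"
    else
        ocrfeeder-cli --images "$@" --format "$fmt" --o "$out"
    fi
}

# Preview the command for two scanned pages.
convert_pages converted_document HTML page1.png page2.jpeg
# prints: ocrfeeder-cli --images page1.png page2.jpeg --format HTML --o converted_document
```

Setting DRY_RUN=0 before the call would run the conversion for real, which requires ocrfeeder-cli to be installed. The dry-run output is a plain preview, so paths containing spaces appear unquoted in it.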
Requirements:

  · Python
  · PyGTK
  · PIL
  · PyGooCanvas
  · AFPL Ghostscript
  · Unpaper

What's New in This Release:

Improvements:

  · Import PDF files faster
  · Add option to detect and include system-wide OCR engines on OCR engines manager dialog
  · Show option to include the detected system-wide OCR engines when the application is started with no engines
  · Better integration with intltool
  · Miscellaneous string fixes (thanks to Philip Withnall)
  · Add man pages
  · Make the translator credits in the About dialog depend on the translation (thanks to Claudio Saavedra)
  · Improve Debian package generation (thanks to Alberto Garcia)
  · Fix Debian package dependencies
  · Allow multiple selection of selection areas
  · Add Ctrl+a shortcut to select all areas
  · Add "recognize selected areas" action

Bug Fixes:

  · Remove PDF files' extension from the images generated from them
  · Sort images when adding them from a folder
  · Select selection area after creating it
  · Fix problem when quitting the application

New and Updated Translations:

  · Marek Cernocky
  · Mario Blättermann
  · Philip Withnall
  · Jorge González
  · Kjartan Maraas
  · Matej Urbančič
  · Daniel Nylander

