
Using Zonal and Background OCR

This FAQ describes how to use the basic zonal OCR functionality included in Scan2Net and how to purchase, install, and use the background OCR option in ScanWizard. It is intended for the operator and the administrator of the scanner at the customer site.

Image Access scanners always include a PC running Linux. The Scan2Net® software includes basic zonal OCR functionality and has now been enhanced with an OCR module based on the Tesseract OCR engine and the Leptonica layout analysis software. This enhancement, known as the "background OCR" option, can be purchased through our portal at https://portal.imageaccess.de.

The Tesseract engine is widely regarded as one of the best OCR engines and currently supports more than 100 languages, including many Asian languages. Many language packages can be downloaded for free from our portal, and all of them can be installed on the target scanner. The engine works best, however, if only one or at most two languages are active at the time of reading.
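The one-or-two-language recommendation can be illustrated with the stock tesseract command-line tool, which accepts several language codes joined by "+" in its -l option. The following is only a minimal sketch and not how Scan2Net invokes the engine: the helper name build_tesseract_command, the constant MAX_ACTIVE_LANGUAGES, and the file names are assumptions made for this example.

```python
import shlex

# Assumption for this sketch: the FAQ's recommendation of one or at most
# two active languages. This is not a limit imposed by Tesseract itself.
MAX_ACTIVE_LANGUAGES = 2


def build_tesseract_command(image_path, output_base, languages):
    """Return an argv list for the stock `tesseract` command-line tool.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    if len(languages) > MAX_ACTIVE_LANGUAGES:
        raise ValueError("activate at most two languages per read")
    # Tesseract expects several language codes joined by '+', e.g. "eng+deu".
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]


if __name__ == "__main__":
    # Example file names are placeholders.
    cmd = build_tesseract_command("page_001.png", "page_001", ["eng", "deu"])
    print(shlex.join(cmd))
```

Rejecting a third language up front mirrors the advice above; on the scanner itself, the active languages are selected in the Scan2Net user interface rather than on a command line.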

The OCR engine runs in the background at low priority and uses whatever computing power remains, so it does not slow down the scanning process or any other process. With OCR enabled, the user scans as usual. Pages are recognized one at a time, and depending on the size of the document and the speed of the operator, OCR may temporarily fall behind the scanning. The software shows the progress for each page. If OCR has fallen behind, it may take a few extra seconds to finish after the last page has been scanned.
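The page-by-page behavior described above resembles a simple producer/consumer queue: scanning adds pages without waiting, a single background worker recognizes them in order, and finishing the job waits for whatever is still queued. The sketch below is a conceptual illustration only, not the Scan2Net implementation. The class BackgroundOcr and the stand-in function fake_ocr are hypothetical, and operating-system priority is not modeled.

```python
import queue
import threading


def fake_ocr(page):
    """Stand-in for a real OCR call (assumption); returns placeholder text."""
    return f"text of {page}"


class BackgroundOcr:
    """Hypothetical sketch: one worker thread recognizes pages in scan order."""

    def __init__(self, ocr_func=fake_ocr):
        self._pages = queue.Queue()
        self._ocr = ocr_func
        self.results = {}
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            page = self._pages.get()
            if page is None:  # sentinel: no more pages will arrive
                break
            self.results[page] = self._ocr(page)

    def submit(self, page):
        """Queue a scanned page; returns immediately so scanning continues."""
        self._pages.put(page)

    def backlog(self):
        """Approximate number of pages still waiting for OCR."""
        return self._pages.qsize()

    def finish(self):
        """Wait for remaining pages; this is the 'extra seconds' at the end."""
        self._pages.put(None)
        self._worker.join()
        return self.results


if __name__ == "__main__":
    job = BackgroundOcr()
    for name in ["page1", "page2", "page3"]:
        job.submit(name)
    print(job.finish())
```

Because submit never blocks, a fast operator can build up a backlog; the wait then moves to finish, which matches the short delay at the end of a job when OCR did not keep up.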

This FAQ covers both zonal OCR for single scans and the optional background OCR for job mode. Because background OCR is an option, its purchase and installation are also described here.
