Overview
Current solutions to the problem of optical character recognition
(OCR) have advanced to the point where recognition rates above 99%
are common for clean, uniformly formatted text. Unfortunately,
the performance of most OCR algorithms degrades very rapidly when
even small amounts of noise are introduced into the original document
or during the scanning process. In many situations, this increased
error rate quickly decreases the return on investment to the point
where it is no longer cost-effective to integrate automated recognition
technology. To push this critical point lower and deal
robustly with noise, OCR systems often perform some type of image
enhancement as a preprocessing step.
Traditional enhancement techniques are applied at the pixel or local level
and include, for example, the use of morphological operators to
reduce speckle, fill small holes and smooth edges. Enhancement
at the symbol level has received much less attention, but may
include, for example, the identification, normalization and precise
segmentation of glyphs. Enhancement at the page level ideally
includes the elimination of copier noise and streaks and the identification
of higher-level structure.
In this research we consider the problem of structure-based document
image enhancement. We will build on previous work to develop the
ability to "learn" the symbol classes which appear in a
given document. These classes can then be used to enhance segmentation
and symbol appearance, providing an improved version of the original
document for OCR. If knowledge about the OCR system is available,
the resulting document can be formatted to optimize the system's
performance and statistical information about the learned classes
can be provided. The general methodology is alphabet-independent,
and depends only on the ability to segment the text into characters.
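To make the idea of learning symbol classes concrete, the following is a minimal sketch, not the method developed in this research. It assumes the glyphs have already been segmented and normalized to equal-sized binary bitmaps, groups them greedily by Hamming distance, and replaces each glyph with the pixel-wise majority vote of its class. The names `learn_classes` and `enhance` and the threshold `max_dist` are hypothetical.

```python
# Glyphs are equal-sized binary bitmaps (lists of rows, 1 = ink).
# Assumption: segmentation and size normalization happen beforehand.

def hamming(a, b):
    """Number of pixel positions at which two equal-sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def majority(glyphs):
    """Pixel-wise strict majority vote over a non-empty list of bitmaps."""
    n = len(glyphs)
    rows, cols = len(glyphs[0]), len(glyphs[0][0])
    return [[int(2 * sum(g[r][c] for g in glyphs) > n) for c in range(cols)]
            for r in range(rows)]

def learn_classes(glyphs, max_dist):
    """Greedy single-pass clustering; returns (labels, prototypes)."""
    members = []      # members[k] holds the glyphs assigned to class k
    prototypes = []   # prototypes[k] is the current majority image of class k
    labels = []
    for g in glyphs:
        dists = [hamming(g, p) for p in prototypes]
        if dists and min(dists) <= max_dist:
            k = dists.index(min(dists))          # join the nearest class
            members[k].append(g)
            prototypes[k] = majority(members[k])
        else:
            k = len(prototypes)                  # start a new class
            members.append([g])
            prototypes.append([row[:] for row in g])
        labels.append(k)
    return labels, prototypes

def enhance(glyphs, max_dist):
    """Replace each glyph with the prototype of the class it was assigned to."""
    labels, prototypes = learn_classes(glyphs, max_dist)
    return [prototypes[k] for k in labels]
```

Because the procedure compares only bitmaps and never refers to character identities, it is independent of the alphabet, in keeping with the methodology described above. The per-class member counts are one simple example of statistical information that could be passed to an OCR system.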