LAMP - Media Group - Research - Enhancement of Text in Video

Document Image Enhancement

Overview

Current solutions to the problem of optical character recognition (OCR) have advanced to the point where recognition rates of > 99% are common for clean, uniformly formatted text. Unfortunately, the performance of most OCR algorithms degrades very rapidly when even small amounts of noise are present in the original document or introduced during the scanning process. In many situations, the increased error rate reduces the return on investment to the point where it is no longer cost-effective to integrate automated recognition technology. To push this critical point lower and deal robustly with noise, OCR systems often perform some type of image enhancement as a preprocessing step.

Traditional enhancement techniques are applied at the pixel or local level and include, for example, the use of morphological operators to reduce speckle, fill small holes and smooth edges. Enhancement at the symbol level has received much less attention, but may include the identification, normalization and precise segmentation of glyphs, for example. Enhancement at the page level ideally includes the elimination of copier noise and streaks and the identification of higher-level structure.
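As an illustration of the pixel-level techniques mentioned above, the following is a minimal sketch of morphological opening (to remove speckle) and closing (to fill small holes) on a binary image. It is not taken from this project: the function names, the 3x3 square structuring element, and the representation of an image as a list of rows of 0/1 values are all assumptions made for the example.

```python
# Hypothetical sketch: binary morphology with a 3x3 square structuring element.
# An image is a list of rows; each pixel is 0 (background) or 1 (ink).
# Neighbors that fall outside the image are simply ignored.

OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _neighbors(img, y, x):
    """Yield the values of the in-bounds pixels in the 3x3 window around (y, x)."""
    h, w = len(img), len(img[0])
    for dy, dx in OFFSETS:
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w:
            yield img[ny][nx]


def erode(img):
    """A pixel stays ink only if every in-bounds neighbor is ink."""
    return [[int(all(_neighbors(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """A pixel becomes ink if any in-bounds neighbor is ink."""
    return [[int(any(_neighbors(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes blobs smaller than the window."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills holes smaller than the window."""
    return erode(dilate(img))
```

Applying `opening` to a scanned patch removes isolated specks, while `closing` fills pinholes inside strokes. A 3x3 window also erases strokes only one or two pixels wide, so in practice the window size would have to be chosen relative to the scan resolution.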

In this research we consider the problem of structure-based document image enhancement. We will build on previous work to develop the ability to "learn" the symbol classes which appear in a given document. These classes can then be used to enhance segmentation and symbol appearance, providing an improved version of the original document for OCR. If knowledge about the OCR system is available, the resulting document can be formatted to optimize the system's performance, and statistical information about the learned classes can be provided. The general methodology is alphabet-independent and depends only on the ability to segment the text into characters.
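The idea of learning symbol classes and using them to improve symbol appearance can be sketched as follows. This is only an illustration under stated assumptions, not the method used in this research: it assumes the glyphs have already been segmented and normalized to equal-size binary bitmaps, groups them with a simple greedy clustering on Hamming distance, and replaces each glyph with a per-pixel majority-vote prototype of its class. All function names and the `max_dist` threshold are hypothetical.

```python
# Hypothetical sketch: learn symbol classes from segmented glyphs and
# replace each glyph with a cleaned-up class prototype.
# A glyph is a list of rows of 0/1 pixels; all glyphs share one size.


def hamming(a, b):
    """Number of pixels at which two equal-size glyphs differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def cluster_glyphs(glyphs, max_dist):
    """Greedy clustering: a glyph joins the first class whose first member
    (the 'leader') is within max_dist; otherwise it starts a new class.
    Returns a list of classes, each a list of glyph indices."""
    classes = []
    for i, glyph in enumerate(glyphs):
        for members in classes:
            if hamming(glyphs[members[0]], glyph) <= max_dist:
                members.append(i)
                break
        else:
            classes.append([i])
    return classes


def prototype(glyphs, members):
    """Per-pixel majority vote over the members of one class
    (a pixel is ink only if more than half of the members have ink there)."""
    first = glyphs[members[0]]
    n = len(members)
    return [[int(2 * sum(glyphs[m][y][x] for m in members) > n)
             for x in range(len(first[0]))]
            for y in range(len(first))]


def enhance(glyphs, max_dist):
    """Return (enhanced glyphs, classes): every glyph is replaced by the
    prototype of the class it was assigned to."""
    classes = cluster_glyphs(glyphs, max_dist)
    enhanced = [None] * len(glyphs)
    for members in classes:
        proto = prototype(glyphs, members)
        for m in members:
            enhanced[m] = proto
    return enhanced, classes
```

Because the sketch compares bitmaps only, it never needs to know which character a class represents, which is consistent with the alphabet-independent framing above. The class sizes returned by `enhance` are one simple example of statistical information about the learned classes.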