MARCH 9, 1999, 1:00
Huiping Li
LAMP, University of Maryland
Text Processing in Digital Document and Video Databases
ABSTRACT
Text is an important information source in documents and digital videos.
In this talk we will address the problems of digital text processing
in documents and digital video. The topics include text extraction,
shapecoding and its application to duplicate document detection,
text quality evaluation, and text enhancement in the document domain,
and text identification, tracking, enhancement, and recognition
in the video domain. The presentation covers both completed work
and proposed work.

Steady increases in computational
power and affordable storage have allowed organizations to scan
large numbers of documents into databases with no indexing information.
We developed a shapecode-based technique to detect duplicate
documents in large databases. A representative text line is chosen
and extracted from each binary document image. We use a dynamically
propagated x-line and baseline to shapecode characters in degraded
text lines. The extracted shapecode is converted to keys that index
directly into the database.

Another interesting topic is document quality
evaluation and enhancement. The motivation for this research is
that, while Optical Character Recognition (OCR) has advanced to
the point that most commercial OCR software can achieve recognition
accuracy as high as 99.9% for clean documents, performance
degrades rapidly when even a little noise is introduced. We
will discuss techniques for enhancing text in degraded document
images. We believe text enhancement is essential for robust character
recognition as well as for other applications. Our proposed method
for text enhancement is more processing oriented than domain oriented.
The basic idea is to use estimation techniques to evaluate
document quality and then automatically apply the enhancement
technique suited to each type of degradation.

Text in digital
video can provide important supplemental information for retrieval
and indexing. Compared with document images, text processing in
digital video poses new challenges. We will address the new problems
that arise and the techniques we developed to solve them. We developed
a hybrid wavelet/neural network classifier to segment text from
digital video. To find temporal correspondence of text blocks
in consecutive frames, we use multi-resolution image matching
to track the detected text. Text contours are used to refine the
matching result. After registration, we use the text blocks from
multiple frames to smooth the background. Image interpolation
techniques are used to raise the resolution of the text image.
Experimental results show that these techniques can improve OCR
results significantly. Finally, we will discuss text-based indexing
and retrieval in digital video.
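
As a rough illustration of the multi-frame enhancement step described above, the following minimal sketch averages already registered text blocks and then enlarges the result. It rests on assumptions: the abstract does not say which smoothing operator or interpolation scheme is used, so a pixel-wise mean and bilinear interpolation stand in here, and the function names `average_frames` and `upsample_bilinear` are hypothetical.

```python
def average_frames(blocks):
    """Pixel-wise mean of equally sized, already registered grayscale blocks.

    Assumption: a temporal mean is used as the smoothing operator; the
    talk abstract does not name the operator actually applied.
    """
    if not blocks:
        raise ValueError("need at least one block")
    rows, cols = len(blocks[0]), len(blocks[0][0])
    n = len(blocks)
    return [[sum(b[r][c] for b in blocks) / n for c in range(cols)]
            for r in range(rows)]


def upsample_bilinear(img, factor):
    """Enlarge a grayscale image by an integer factor.

    Assumption: bilinear interpolation is chosen for illustration only.
    """
    rows, cols = len(img), len(img[0])
    out = []
    for i in range(rows * factor):
        # Map the output row back into the source grid, clamped at the edge.
        y = min(i / factor, rows - 1)
        y0 = int(y)
        y1 = min(y0 + 1, rows - 1)
        fy = y - y0
        row = []
        for j in range(cols * factor):
            x = min(j / factor, cols - 1)
            x0 = int(x)
            x1 = min(x0 + 1, cols - 1)
            fx = x - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

Averaging suppresses background variation that changes from frame to frame while the registered text stays in place, and the enlarged block can then be passed to an OCR engine.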