Perhaps the greatest versatility demonstrable today by computer vision
technology occurs in the interpretation of images of unconstrained
multilingual machine-printed documents. I review the state of
the art with particular attention to the analysis of complex unoriented
page layouts, classification of symbols, and exploitation of linguistic
contexts. Classifiers, to be useful in this domain, must generalize
strongly across a wide range of writing systems, typefaces, and
image degradations. The talk is illustrated with Chinese, English,
Hebrew, Japanese, Korean, Swedish, Russian, Thai, Tibetan, and
Turkish examples, plus a Russian-English dictionary. I conduct
a brief tour of a software system architecture that allows our
largely language-independent page-reader to be rapidly retargeted
to new languages. (This is joint work with David Ittner, Tin Kam
Ho, Craig Nohl, Dar-Shyang Lee, and others.)

Henry S. Baird is a Member of Technical Staff, Lucent Technologies,
Murray Hill, New Jersey. His research focuses on the design and
analysis of algorithms for machine vision with emphasis on the
interpretation of images of printed documents. He is an Area Editor
for the journal Computer Vision and Image Understanding. During
1989-91, he was an Associate Editor of IEEE Transactions on Pattern
Analysis and Machine Intelligence. He was chair of the 1996 Symposium
on Document Analysis and Information Retrieval, and was principal
organizer of the 1990 IAPR Workshop on Syntactic and Structural
Pattern Recognition. His Princeton University Ph.D. thesis on
algorithms for image matching won a 1984 ACM Distinguished Dissertation
Award and was published by the MIT Press. In 1976, his Master's
thesis gave the first complete description of the sweep-line algorithm,
now seen as a fundamental technique in computational geometry.
He is a Fellow of the IAPR, a senior member of the IEEE, and a
member of the ACM.