LAMP - Language Group

LAMP Seminar
Language and Media Processing Laboratory
Conference Room 4406
A.V. Williams Building
University of Maryland

April 13, 1998, 1:00
Henry S. Baird
Xerox Palo Alto Research Center
baird@parc.xerox.com

Document Image Analysis Research at Xerox PARC

ABSTRACT

Recent research at PARC that is closely related to document image analysis (DIA) includes text recognition ("decoding"), image-based summarization (DimSum), paper-digital interfaces (DataGlyphs, DataIcons, etc.), and token-based compression (DigiPaper). More distantly related to DIA are our special bookscanner and our document-image repository and service bus architectures. I will touch on all of these and then go deep on the recognition work, including highlights of results from the past year and a roadmap of our research plans for the coming year or so. As active collaborators within the UC Berkeley Digital Library Initiative project, we plan to extract structured scientific data from rare books of interest to the botanical scholarly community and integrate the results with the Berkeley `CalFlora' digital-library website.

If anyone would like to meet with the speaker, please contact Tapas Kanungo (kanunguo@cfar.umd.edu).