Recent research at PARC that is closely related to document
image analysis (DIA) has included text recognition ("decoding"),
image-based summarization (DimSum), paper-digital interfaces (DataGlyphs,
DataIcons, etc.), and token-based compression (DigiPaper). More
distantly related to DIA are our special book scanner and our document-image
repository and service-bus architectures. I will touch on all
of these and then go deep on the recognition work, including highlights
of results from the past year and a roadmap for our research plans
for the coming year or so. As active collaborators within the
UC Berkeley Digital Library Initiative project, we plan to extract
structured scientific data from rare books of interest to the
botanical scholarly community and integrate the results with the
Berkeley `CalFlora' digital-library website.
If anyone would like to meet with the speaker, please contact Tapas
Kanungo (kanunguo@cfar.umd.edu).