Agenda:
The conference will be spread over three days: April
9, 10, and 11, 2003. Each session will have a government representative
discussing programs or needs related to that session's technology focus. Thursday
afternoon will be devoted to abstracts and demonstrations. The Friday
agenda will have additional talks and government discussions on
potential collaboration and funding sources.
2003 Symposium on Document Image Understanding Technology
Greenbelt Marriott, Greenbelt, Maryland
April 9-11, 2003
Tentative Agenda
Wednesday, April 9th
9:00 Welcome
9:10 Page Structure 1
Assuring High-Accuracy Document Understanding: Retargeting,
Scaling Up, and Adapting
Henry Baird, Kris Popat, Thomas Breuel, Prateek Sarkar, Daniel
Lopresti (Palo Alto Research Center)
Automated Data Extraction from Structured Documents
Janusz Wnek (Science Applications International Corporation)
Automated Logo Detection and Recognition
Tom Drayer and Ken Cantwell (Department of Defense)
10:20 Break
10:40 Government Talk
Retrospective of Document Analysis since SDIUT 95
Steve Dennis (Department of Defense)
11:00 Multilingual Documents
Farsi Searching and Display Technologies
Kazem Taghva, Ron Young, Jeff Coombs, Ray Pereda,
Russell Beckley, and Mohammad Sadeh
(University of Nevada, Las Vegas)
Porting the BBN BYBLOS OCR System to New Languages
Prem Natarajan, Michael Decerbo, Tom Keller, Rich Schwartz,
and John Makhoul (BBN Technologies)
Segmenting and Tagging Structured Content
Huanfeng Ma, Burcu Karagol-Ayan and David Doermann
(University of Maryland, College Park)
12:00 Lunch
1:00 Keynote Talk: Document Processing and Understanding:
An Integrated Approach
Patrice Simard (Microsoft Research)
2:00 Handwriting
A System for Handwriting Matching and Recognition
Sargur Srihari, Bin Zhang, Catalin Tomai, Sangjik Lee, Zhixin
Shi and Yong-Chul Shin (State University of New York/Cedar
Buffalo)
Indexing of Handwritten Historical Documents: Recent
Progress
R. Manmatha and T.M. Rath (University of Massachusetts,
Amherst)
Document Categorization Using Latent Semantic Indexing
Anthony Zukas and Robert Price
(Science Applications International Corporation)
Parsing Freeform Handwritten Notes on the Tablet PC
Michael Shilman (Microsoft Research)
3:30 Break
3:40 Government Talk
OCR for Collection, Management, and Retrieval of Documents:
Development and Trial of a Documentation Exploitation Suite
Luis Hernandez and Christian Schlesiger (Army Research
Laboratory)
4:00 Degraded Documents
Processing Noisy Documents
David Doermann and Huiping Li (University of Maryland,
College Park)
Summarizing Noisy Documents
Hongyan Jing, Daniel Lopresti and Chilin Shih (Bell Laboratories,
Lucent Technologies Inc.)
Rough and Degraded Document Interpretation by Perceptual Organization
Eric Saund, David Fleet, James Mahoney, and Daniel Larner
(Palo Alto Research Center)
OCR Accuracy Prediction as a Script Identification Problem
Vitaly Ablavsky, Joshua Pollak, Magnus Snorrason
and M. Stevens (Charles River Analytics)
Thursday, April 10th
9:00 OCR and OCR Correction
A Survey of Retrieval Strategies for OCR Text Collections
Steven Beitzel, Eric Jensen and David Grossman
(Illinois Institute of Technology)
OCR Accuracy and Retrievability of Post-Processed Documents
T. Nartker, Kazem Taghva, and Julie Borsack (University
of Nevada, Las Vegas)
Varying Effects of Image Improvement Methods on OCR Accuracy
Kristen Summers (Vredenburg)
OCR Correction Using Historical Relationships from Verified
Text in Biomedical Citations
Susan Hauser, Tehseen Sabir and George Thoma
(National Library of Medicine)
10:20 Break
10:40 Document Analysis Resources
Balanced Query Methods for Improving OCR-Based Retrieval
Kareem Darwish and Douglas Oard
(University of Maryland, College Park)
Creation of Multi-Lingual Data Resources and Evaluation
Tool for OCR
Srirangaraj Setlur, Suryaprakash Kompalli, Ramanaprasad Vemulapati
and Venu Govindaraju
(State University of New York/Cedar Buffalo)
Multilingual OCR Ground Truth from Printed and Web Sources
Mark Turner, Yuliya Katsnelson, and Kristen Summers
(Vredenburg, Inc)
Ground Truth Data for Document Image Analysis
Glen Ford and George Thoma (National Library of Medicine)
12:00 Lunch
1:00 Keynote Talk: Overview of the Questioned Document Unit
Gabriel Watts (Federal Bureau of Investigation)
2:00 Government Talk
Transitioning Experimental HMM OCR: From Lab to Field
Christian Schlesiger, Luis Hernandez, and Michael Lee
(Army Research Laboratory)
The Effects of Document Analysis on Automatic Content Extraction
Jonathan K. Davis (Department of Defense)
An Automation Tool For the Detection of Sensitive Information
Gary DeWitt (Department of Energy)
3:30 Exhibits/Demonstrations and Poster/Abstracts
The Gamera Framework for Building Custom Recognition Systems
Mike Droettboom, Karl MacMillan and
Ichiro Fujinaga (Johns Hopkins University)
3D Methods to Aid Handwriting Analysis and OCR
Anshuman Razdan, John Femiani, Jeremy Rowe (Arizona State
University)
Automated Reading of Free-Form Handwriting in Images, The Past
and One Proposed Future
Joanna Fancy (Higherglyphics)
The Video Spectral Comparator 2000HR
Greggory Mokrzycki (Federal Bureau of Investigation)
Demonstration for Parsing Freeform Handwritten Notes on the
Tablet PC
Michael Shilman (Microsoft, Incorporated)
ABBYY OCR Software
Artur Vassylyev and Ding-Yuan Tang (ABBYY Software House)
Document Layout Analysis
Thomas Breuel (Palo Alto Research Center)
High View Document Image Management Tool
Mark Turner (Vredenburg)
ScanSoft Asian Language OCR Capability
Tom D'Errico (ScanSoft)
Groundtruth Image Generation from Electronic Text (Demonstration)
David Doermann and Gary Zi (University of Maryland)
4:30-7:00 Demos, Posters and Exhibits
An informal reception with food and drink will be held during
the Exhibit Session.
Friday, April 11th
9:00 Page Structure 2
High Performance Document Layout Analysis
Thomas Breuel (Palo Alto Research Center)
Automated Layout Recognition
Lynn Golebiowski and Alan Sakakihara (Booz Allen Hamilton)
Automatic Forms Processing in the NIST Forms Database Document
Image Understanding Technology 2003
Carson Cumbee (Department of Defense)
Amplifying Accuracy through Style-Consistency
Prateek Sarkar and Thomas Breuel (Palo Alto Research
Center)
A Generative Probabilistic OCR Model
Okan Kolak, Philip Resnik and William Byrne (University of
Maryland, College Park and The Johns Hopkins University)
10:40 Break
11:00 Multimedia
Form Analysis with the Nondeterministic Agent
Tom Henderson and Lavanya Swaminathan (University of
Utah)
Metrics for Evaluating the Performance of Video Text Recognition
Systems
Greg Myers (Stanford Research Institute)
Universal Document Management System for the Mobile Warrior
H. Alam, R. Hartono, Fuad Rahman, Y. Tarnikova, T.
Tjahjadi and C. Wilcox (BCL Technologies)
12:00 Lunch
1:00 Government
The Declassification Challenge: Can Technology Make a Difference?
Richard Warshaw (Central Intelligence Agency)
VACE Advanced R&D Program
John Prange (Advanced Research and Development Activity)
2:00 Panel
Government Grand Challenges: What we need and how we get it?