UMIACS Computational Linguistics Colloquium Series, November 6, 1997


Analysis for Generation of Coherent Summaries of On-Line Documents


Dr. Judith L. Klavans
Columbia University

Given the exponential growth of online information, one of the primary difficulties facing Internet users is information overload. Summaries can function as an abbreviated form of a document or as an aid in assessing the relevance of a document to a selected topic, thereby reducing the amount of information a user must read.

This talk will report on research results from a project that uses a multi-level summarization process to present available information during browsing or searching. The unique aspects of our research are: the integration of knowledge about a document derived using both statistical and symbolic analysis techniques; the use of language generation to reformulate this information into a concise and coherent summary; and summarization across sets of related articles. Unlike approaches that use purely statistical techniques to extract existing sentences from documents and present them as a "summary", our approach produces summaries that are coherent and highly readable.

The primary focus of this presentation will be on the document analysis component of the project, including segmentation into topics, the identification of key content within each section, and the construction of a lexical semantic representation for use in the summary generator.

This research is joint work with Professor Kathleen McKeown of the Columbia University Department of Computer Science, and is supported by the National Science Foundation under the Speech, Text, Image and Multimedia Advanced Technology Effort (STIMULATE) program, grant number IRI-9618797.

