LAMP - Language Group

LAMP Seminar
Language and Media Processing Laboratory
Conference Room 4406
A.V. Williams Building
University of Maryland

March 2, 2000, 4:00 PM
Helen Meng

Human-Computer Communications Laboratory
The Chinese University of Hong Kong

The Use of Syllables and Words for Indexing Spoken Documents in Chinese

ABSTRACT

This talk describes our initial attempt at spoken document retrieval using the audio tracks of local television news broadcasts in Cantonese, a major dialect of Chinese. We studied the use of syllable-based units for audio indexing: base syllables and tonal syllables as monosyllables, and overlapping bi-syllables and tri-syllables. We compared the syllable with the word as the unit for audio indexing. We performed a known-item retrieval task using a video archive of 1801 news stories. The stories' transcripts were mapped into syllables by referencing our pronunciation dictionary (CUPDICT) and lexicon (CULEX). The news domain is extremely diverse: many words or terms in the news corpus (54.5 hours) are absent from our lexicons, which affected our text-based retrieval results. Indexing with overlapping bi-syllables (with tone) gave the best average inverse rank (AIR) of 0.83. The incorporation of lexical knowledge effectively reduced the size of the index term set while sustaining retrieval performance. We also attempted retrieval using speech recognition output. Our recognizer was trained mostly on clean, read speech and had little adaptation to broadcast-quality speech. Using base syllables as overlapping bigrams, the AIR degraded to 0.46 due to recognition errors. To bridge the gap between text-based queries and audio-based documents, we also applied a query expansion technique that references the syllable recognition confusion matrix. The technique was found to improve retrieval performance.

Bio: Helen Meng obtained her S.B., S.M. and Ph.D. degrees, all from the Massachusetts Institute of Technology, where she also worked as a Research Scientist in the Spoken Language Systems Group. In 1998, she joined the Chinese University of Hong Kong and founded the Human-Computer Communications Laboratory (HCCL).
Helen's research interest is in the area of human-computer interaction via spoken language systems, which integrate a plethora of speech and language technologies, including speech recognition, natural language understanding, discourse and dialog modeling, language generation and speech synthesis. She is also interested in translingual speech retrieval, and will be leading the Mandarin-English Information (MEI) team at the Johns Hopkins Summer Workshop 2000.
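
ILLUSTRATION

As a rough illustration of the index units and the evaluation measure mentioned in the abstract, the following Python sketch forms overlapping syllable n-grams and computes an average inverse rank. It is not the speaker's implementation. The function names, the Jyutping-style example syllables, and the convention that an unretrieved item contributes zero are all assumptions made for this sketch, and AIR is taken here to be the mean of 1/rank of the known item over all queries.

```python
from typing import List, Optional, Sequence


def base_syllable(tonal_syllable: str) -> str:
    """Drop a trailing tone digit, e.g. 'gong2' -> 'gong' (assumed notation)."""
    return tonal_syllable.rstrip("0123456789")


def overlapping_ngrams(syllables: Sequence[str], n: int) -> List[str]:
    """Return overlapping n-grams over a syllable sequence as index terms."""
    return ["_".join(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


def average_inverse_rank(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank over queries in a known-item task.

    Each entry is the 1-based rank at which the known item was returned;
    None means it was not retrieved and contributes 0 (an assumption).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Hypothetical transcript fragment, already mapped to tonal syllables.
tonal = ["hoeng1", "gong2", "san1", "man4"]
tonal_bigrams = overlapping_ngrams(tonal, 2)
base_bigrams = overlapping_ngrams([base_syllable(s) for s in tonal], 2)
base_trigrams = overlapping_ngrams([base_syllable(s) for s in tonal], 3)

print(tonal_bigrams)
print(base_bigrams)
print(base_trigrams)
print(average_inverse_rank([1, 2, 4]))
```

With overlapping units, each syllable takes part in more than one index term, so a single recognition error corrupts only the terms that span it.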