March 2, 2000, 4:00 PM
Helen Meng
Human-Computer Communications Laboratory
The Chinese University of Hong Kong
The Use of Syllables and Words for Indexing Spoken Documents in Chinese
ABSTRACT
This talk describes our initial attempt in spoken document retrieval
using the audio tracks of local television news broadcasts in
Cantonese, a major dialect of Chinese. We studied the use of syllable-based
units for audio indexing, which include base syllables and tonal
syllables as monosyllables, overlapping bi-syllables and tri-syllables.
The syllable was compared with the word for audio indexing. We performed
a known-item retrieval task, using a video archive of 1801 news
stories. The stories' transcripts were mapped into
syllables by referencing our pronunciation dictionary (CUPDICT)
and lexicon (CULEX). The news domain is extremely diverse:
many words or terms in the news corpus (54.5 hours) are absent
from our lexicons, which affected our retrieval results based
on text. Indexing with overlapping bi-syllables (with tone) gave
the best average inverse rank (AIR) of 0.83. The incorporation
of lexical knowledge effectively reduced the size of the index
term set while sustaining retrieval performance. We also attempted
retrieval using speech recognition outputs. Our recognizer was
trained mostly on clean, read speech and had little adaptation
to broadcast-quality speech. Using base syllables as overlapping
bigrams, the AIR degraded to 0.46 due to recognition errors. To
bridge the gap between text-based queries and audio-based documents,
we also applied a query expansion technique, referencing the syllable
recognition confusion matrix for expansion. The technique was
found to contribute towards retrieval performance improvement.
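
As an illustration of the indexing units and the evaluation measure mentioned in the abstract, here is a minimal Python sketch. It is not the speaker's implementation. It assumes that overlapping n-grams are formed by sliding a window one syllable at a time, that a base syllable is a tonal syllable with its trailing tone digit removed, and that AIR is the mean of 1/rank of the known item over all queries (with 0 for a query whose item is not retrieved). The function names and the sample romanized syllables are hypothetical.

```python
from typing import List, Optional, Sequence


def strip_tone(syllable: str) -> str:
    """Map a tonal syllable to its base syllable by dropping trailing tone digits.

    Assumes tones are written as trailing digits, e.g. 'gong2' -> 'gong'.
    """
    return syllable.rstrip("0123456789")


def overlapping_ngrams(syllables: Sequence[str], n: int) -> List[str]:
    """Return overlapping syllable n-grams, sliding one syllable at a time.

    Each n-gram is joined with '_' to form a single index term.
    A sequence shorter than n yields an empty list.
    """
    return ["_".join(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


def average_inverse_rank(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank over all queries (ranks are 1-based).

    A rank of None means the known item was not retrieved and contributes 0.
    """
    if not ranks:
        raise ValueError("at least one query is required")
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)


# Hypothetical tonal syllable sequence for one short transcript.
tonal = ["hoeng1", "gong2", "san1", "man4"]
base = [strip_tone(s) for s in tonal]

tonal_bigrams = overlapping_ngrams(tonal, 2)
base_trigrams = overlapping_ngrams(base, 3)

# Three hypothetical queries: item ranked 1st, ranked 2nd, and not retrieved.
air = average_inverse_rank([1, 2, None])
```

Under these assumptions, an AIR of 1.0 would mean that every known item is returned at the top of its ranked list, so the 0.83 and 0.46 figures above can be read against that ceiling.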
Bio: Helen Meng obtained her S.B., S.M. and Ph.D. degrees, all
from the Massachusetts Institute of Technology, where she also
worked as a Research Scientist in the Spoken Language Systems Group.
In 1998, she joined the Chinese University of Hong Kong, and founded
the Human-Computer Communications Laboratory (HCCL). Helen's
research interest is in the area of human-computer interaction
via spoken language systems, which integrate a plethora of speech
and language technologies, including speech recognition, natural
language understanding, discourse and dialog modeling, language
generation and speech synthesis. She is also interested in translingual
speech retrieval, and will be leading the Mandarin-English Information
(MEI) team at the Johns Hopkins Summer Workshop 2000.