Concept disambiguation for improved subject access using multiple knowledge sources

Title: Concept disambiguation for improved subject access using multiple knowledge sources
Publication Type: Journal Articles
Year of Publication: 2007
Authors: Sidhu T, Klavans J, Jimmy Lin
Journal: Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007)
Pagination: 25 - 25
Date Published: 2007
Abstract

We address the problem of mining text for relevant image metadata. Our work is situated in the art and architecture domain, where highly specialized technical vocabulary presents challenges for NLP techniques. To extract high-quality metadata, the problem of word sense disambiguation must be addressed in order to avoid leading the searcher to the wrong image as a result of ambiguous, and thus faulty, metadata. In this paper, we present a disambiguation algorithm that attempts to select the correct sense of nouns in textual descriptions of art objects, with respect to a rich domain-specific thesaurus, the Art and Architecture Thesaurus (AAT). We performed a series of intrinsic evaluations using a data set of 600 subject terms extracted from an online National Gallery of Art (NGA) collection of images and text. Our results show that the use of external knowledge sources yields an improvement over a baseline.