Medical literature is often annotated with subject headings by professionals. These headings make the subjects of documents explicit and serve as a pivot language between documents and users, which supports information seeking. Current information retrieval methods that use subject headings have not yet fully exploited their potential, and both positive and negative results have been reported. In this paper, we explore the three-layer structure of documents annotated with subject headings, consisting of a document layer, a concept layer (i.e., subject headings), and a term layer, and we propose a concept-enhanced relevance model. Document-concept associations are mined to generate conceptual representations of documents, and concept-term associations are quantified and used to represent concepts as language models. By embedding these associations, subject headings enrich the document models used in the estimation process of the relevance model. Experiments on two medical collections show that our model improves over three state-of-the-art baselines. Therefore, if exploited appropriately, manually curated annotations such as subject headings can become an effective tool for enhancing information retrieval.
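As a rough illustration of the three-layer idea, the sketch below mixes a document's own term distribution with the term distributions of its subject headings and then feeds the enriched document models into a standard RM1-style relevance model estimate. It is not the formulation used in the paper: the linear interpolation, the weight `lam`, the floor `eps`, and all function names are assumptions made only for this example, and the document-concept weights and concept language models are taken as given inputs rather than mined.

```python
from collections import Counter


def term_model(tokens):
    """Maximum-likelihood unigram model P(w | text)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def enriched_doc_model(doc_tokens, doc_concepts, concept_models, lam=0.5):
    """Interpolate a document's term model with its concepts' term models.

    doc_concepts:   {concept: P(c | D)}, assumed to sum to 1
    concept_models: {concept: {term: P(w | c)}}, each assumed to sum to 1
    lam:            hypothetical mixing weight for the concept layer
    """
    own = term_model(doc_tokens)
    vocab = set(own)
    for c in doc_concepts:
        vocab |= set(concept_models[c])
    model = {}
    for w in vocab:
        # Term probability reached through the concept layer.
        via_concepts = sum(
            p_c * concept_models[c].get(w, 0.0)
            for c, p_c in doc_concepts.items()
        )
        model[w] = (1 - lam) * own.get(w, 0.0) + lam * via_concepts
    return model


def relevance_model(query, doc_models, eps=1e-6):
    """RM1-style estimate: P(w | R) proportional to sum_D P(w | D) * P(Q | D).

    A uniform document prior is assumed; eps is a crude floor for
    query terms that a document model does not contain.
    """
    scores = Counter()
    for model in doc_models:
        q_lik = 1.0
        for q in query:
            q_lik *= model.get(q, eps)
        for w, p in model.items():
            scores[w] += p * q_lik
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}
```

With a toy document containing only "hypertension" and "treatment" and annotated with a single heading whose language model covers "blood" and "pressure", the enriched document model assigns non-zero probability to "blood" and "pressure" even though neither occurs in the text. That is the effect the concept layer is meant to have on the relevance model's expansion terms.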