Paper information

Passage relevance models for genomics search

We present a passage relevance model for integrating semantic and statistical evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and documents are represented as potential functions within a Markov Random Field. The probability of a passage being relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback of top ranked passages is used to improve distributional estimates of concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence including dependencies between topics, concepts, and terms, we seek to improve genomics literature passage retrieval precision. Using this model, we are able to demonstrate statistically significant improvements in retrieval precision using a large genomics literature corpus.
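
As a rough illustration of scoring a passage by the product of potential functions, the sketch below ranks passages by a weighted sum of log-potentials, which is rank-equivalent to the unnormalized joint probability in a log-linear Markov Random Field. The component names (`term`, `concept`, `topic`), the weights, the smoothing floor, and the toy evidence values are all assumptions made for illustration; the abstract does not specify the paper's actual potential functions or parameters.

```python
import math

# Hypothetical component weights; the abstract gives no actual parameter values.
WEIGHTS = {"term": 0.5, "concept": 0.3, "topic": 0.2}
EPSILON = 1e-9  # assumed floor so log() is defined when evidence is zero


def log_potential(evidence: float, weight: float) -> float:
    """Log of one log-linear potential: weight * log(evidence)."""
    return weight * math.log(max(evidence, EPSILON))


def passage_score(evidence: dict) -> float:
    """Sum of log-potentials over all components.

    This is the log of the product of potentials, i.e. the joint
    probability up to the normalizing constant Z. Z is the same for
    every passage, so it does not affect the ranking.
    """
    return sum(
        log_potential(evidence.get(name, 0.0), weight)
        for name, weight in WEIGHTS.items()
    )


def rank_passages(passages: dict) -> list:
    """Return passage ids sorted from highest to lowest score."""
    return sorted(
        passages,
        key=lambda pid: passage_score(passages[pid]),
        reverse=True,
    )


# Toy evidence values in (0, 1], standing in for smoothed match probabilities.
passages = {
    "p1": {"term": 0.20, "concept": 0.10, "topic": 0.30},
    "p2": {"term": 0.05, "concept": 0.40, "topic": 0.10},
    "p3": {"term": 0.01, "concept": 0.01, "topic": 0.01},
}
ranking = rank_passages(passages)
```

Working in log space avoids numerical underflow when many small potentials are multiplied, and leaving out the normalizing constant is safe only because the scores are used for ranking rather than as calibrated probabilities.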

Ophir Frieder | Nazli Goharian | Jay Urbain
