Expanding Gene-Based PubMed Queries

The rapid growth of the scientific literature makes finding relevant articles increasingly demanding for biomedical researchers. We investigate a query expansion strategy based on a thesaurus built from standard resources such as the Entrez Gene, UniProt and KEGG databases. Results on the ad hoc retrieval task of the TREC 2004 Genomics track show that query expansion improves retrieval performance on gene-centered queries. The approach obtained an overall mean average precision of 0.4504, a 96% increase over using PubMed as the retrieval engine.
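The general idea of thesaurus-based query expansion can be illustrated with a minimal sketch. It is not the implementation evaluated here: the toy thesaurus, the function names `quote` and `expand_query`, and the choice to join terms with `AND` and synonyms with `OR` are all assumptions made for illustration. A real thesaurus would be compiled from resources such as Entrez Gene, UniProt and KEGG.

```python
from typing import Dict, List

# Hypothetical toy thesaurus mapping an upper-cased gene symbol to synonyms.
# Illustrative entries only; not the thesaurus built in this work.
THESAURUS: Dict[str, List[str]] = {
    "TP53": ["p53", "tumor protein p53"],
    "BRCA1": ["breast cancer 1", "RNF53"],
}


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(query: str, thesaurus: Dict[str, List[str]]) -> str:
    """Replace each known gene symbol with an OR-group of its synonyms.

    Tokens that are not in the thesaurus are kept unchanged, and all
    clauses are combined with AND (an assumption of this sketch).
    """
    clauses = []
    for token in query.split():
        synonyms = thesaurus.get(token.upper(), [])
        if synonyms:
            # Keep the original token first, then add distinct synonyms.
            variants = [token] + [s for s in synonyms if s.lower() != token.lower()]
            clauses.append("(" + " OR ".join(quote(v) for v in variants) + ")")
        else:
            clauses.append(token)
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(expand_query("TP53 apoptosis", THESAURUS))
```

Under these assumptions, the query `TP53 apoptosis` becomes `(TP53 OR p53 OR "tumor protein p53") AND apoptosis`, which could then be submitted to a retrieval engine in place of the original query.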
