Evaluation of biomedical text-mining systems: Lessons learned from information retrieval

Biomedical text-mining systems hold great promise for improving the efficiency and productivity of biomedical researchers. However, such systems are still not in routine use. One impediment to their development is the lack of systematic and rigorous evaluation comparable to the approaches developed for information retrieval systems. Developers of text-mining systems need both to improve test collections for system-oriented evaluation and to undertake user-oriented evaluations to determine the most effective use of their systems for their intended audience.
