Proof of concept: concept-based biomedical information retrieval

In this thesis we investigate the possibility of integrating domain-specific knowledge into biomedical information retrieval (IR). Recent decades have shown a fast-growing interest in biomedical research, reflected by an exponential growth in scientific literature. An important problem for biomedical IR is dealing with the complex and inconsistent terminology encountered in biomedical publications. Dealing with this terminology problem requires domain knowledge stored in terminological resources: controlled indexing vocabularies and thesauri. The integration of this knowledge is, however, far from trivial.

The first research theme investigates heuristics for obtaining word-based representations from biomedical text for robust retrieval. We investigated the effect of choices in document preprocessing heuristics on retrieval effectiveness. Heuristics such as stop word removal, stemming, and breakpoint identification and normalization were shown to strongly affect retrieval performance. An effective combination of heuristics was identified to obtain a word-based representation from text for the remainder of this thesis.

The second research theme deals with concept-based retrieval. We compared a word-based to a concept-based representation and determined to what extent a manual concept-based representation can be automatically obtained from text. Retrieval based on concepts alone was demonstrated to be significantly less effective than word-based retrieval. This deteriorated performance could be explained by errors in the classification process, limitations of the concept vocabularies, and limited exhaustiveness of the concept-based document representations. Retrieval based on a combination of word-based and automatically obtained concept-based query representations did significantly improve on word-only retrieval.

In the third and last research theme we propose a cross-lingual framework for monolingual biomedical IR. In this framework, the integration of a concept-based representation is viewed as a cross-lingual matching problem involving a word-based and a concept-based representation language. This framework gives us the opportunity to adopt a large set of established cross-lingual information retrieval (CLIR) methods and techniques for this domain. Experiments with basic term-to-term translation models demonstrate that this approach can significantly improve word-based retrieval. Directions for future work are using these concepts for communication between user and retrieval system, extending the translation models, and extending CLIR-enhanced concept-based retrieval outside the biomedical domain.

Available online from http://purl.utwente.nl/publications/72481.
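The cross-lingual view in the third theme can be illustrated with a small sketch. This is not the thesis's actual model: it assumes a hypothetical word-to-concept translation table, made-up concept identifiers such as `C_MI`, Jelinek-Mercer smoothed unigram language models, and a simple linear interpolation weight `alpha` between the word-based and concept-based scores. All function names are illustrative.

```python
import math
from collections import Counter


def language_model(tokens, background, lam=0.5):
    """Return a Jelinek-Mercer smoothed unigram model over `tokens`.

    `background` maps terms to collection probabilities; unseen terms get
    a tiny floor value (an assumption made to keep the logarithm finite).
    """
    counts = Counter(tokens)
    total = sum(counts.values())

    def prob(term):
        ml = counts[term] / total if total else 0.0
        return (1 - lam) * ml + lam * background.get(term, 1e-9)

    return prob


def translate_query(query_words, translation):
    """Translate a word query into a concept distribution.

    Uses a basic term-to-term model: P(c|Q) = (1/|Q|) * sum_w P(c|w).
    """
    concept_probs = Counter()
    for word in query_words:
        for concept, p in translation.get(word, {}).items():
            concept_probs[concept] += p / len(query_words)
    return dict(concept_probs)


def score(query_words, doc_words, doc_concepts, translation,
          word_bg, concept_bg, alpha=0.5):
    """Interpolate a word-based and a translated concept-based log score."""
    word_lm = language_model(doc_words, word_bg)
    concept_lm = language_model(doc_concepts, concept_bg)
    word_score = sum(math.log(word_lm(w)) for w in query_words) / len(query_words)
    query_concepts = translate_query(query_words, translation)
    concept_score = sum(p * math.log(concept_lm(c))
                        for c, p in query_concepts.items())
    return (1 - alpha) * word_score + alpha * concept_score


# Hypothetical translation table and toy documents (illustrative only).
translation = {
    "heart": {"C_MI": 0.5, "C_HEART": 0.5},
    "attack": {"C_MI": 1.0},
}
word_bg = {"heart": 0.01, "attack": 0.01}
concept_bg = {"C_MI": 0.01, "C_HEART": 0.01, "C_PANIC": 0.01}

query = ["heart", "attack"]
doc_a = (["myocardial", "infarction", "treatment"], ["C_MI"])
doc_b = (["panic", "treatment", "options"], ["C_PANIC"])

score_a = score(query, *doc_a, translation, word_bg, concept_bg)
score_b = score(query, *doc_b, translation, word_bg, concept_bg)
print(score_a > score_b)
```

In this toy setting neither document contains the query words, so a word-only score (`alpha=0`) cannot separate them, while the translated concept representation favours the document indexed with the matching concept.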
