Thesaurus or Logical Ontology, Which One Do We Need for Text Mining?

Ontologies are recognised as important tools, not only for effective and efficient information sharing, but also for information extraction and text mining. In the biomedical domain, the need for a common ontology for information sharing has long been recognised, and several ontologies are now widely used. However, there is confusion among researchers concerning the type of ontology that is needed for text mining, and how it can be used for effective knowledge management, sharing, and integration in biomedicine. We argue that there are several different ways to define an ontology and that, while the logical view is popular for some applications, it may be neither possible nor necessary for text mining. We propose a text-centered approach for knowledge sharing, as an alternative to formal ontologies. We argue that a thesaurus (i.e. an organised collection of terms enriched with relations) is more useful for text mining applications than formal ontologies.
