Automatic Alignment and Extraction of Bilingual Domain Ontology for Medical Domain Web Search

This paper proposes an approach to automated ontology alignment and domain ontology extraction from two knowledge bases. First, the WordNet and HowNet knowledge bases are aligned to construct a bilingual universal ontology based on the co-occurrence of words in a parallel corpus. The bilingual universal ontology has the advantage of covering more structural and semantic information, because it draws on two complementary knowledge bases. For domain-specific applications, a medical domain ontology is then extracted from the universal ontology using the island-driven algorithm and a medical domain corpus. Finally, domain-dependent terms and some axioms between medical terms, derived from a medical encyclopaedia, are added to the ontology. To evaluate the ontology, web search experiments were conducted using the constructed ontology. The experimental results show that the proposed approach can automatically align and extract the domain-specific ontology, and that the extracted ontology shows promise for medical web search.
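As a rough illustration of co-occurrence-based alignment over a parallel corpus, the sketch below scores English/Chinese word pairs by how often they appear in the same sentence pair. It is only a minimal sketch under stated assumptions: the abstract does not specify the association measure, so the Dice coefficient is used here as a stand-in, and the function name `dice_alignment`, the `threshold` parameter, and the toy corpus are hypothetical. The paper aligns WordNet and HowNet entries, whereas this example aligns plain words.

```python
from collections import Counter
from itertools import product


def dice_alignment(sentence_pairs, threshold=0.5):
    """Score (English, Chinese) word pairs by sentence-level co-occurrence.

    Assumption: the Dice coefficient stands in for whatever association
    measure the paper actually uses; the abstract does not name one.
    """
    en_count, zh_count, pair_count = Counter(), Counter(), Counter()
    for en_words, zh_words in sentence_pairs:
        # Count each word at most once per sentence pair.
        en_set, zh_set = set(en_words), set(zh_words)
        en_count.update(en_set)
        zh_count.update(zh_set)
        pair_count.update(product(en_set, zh_set))

    scores = {}
    for (en, zh), both in pair_count.items():
        dice = 2 * both / (en_count[en] + zh_count[zh])
        if dice >= threshold:
            scores[(en, zh)] = dice
    return scores


# Hypothetical toy parallel corpus (tokenized English / Chinese sentences).
toy_corpus = [
    (["fever", "cough"], ["发烧", "咳嗽"]),
    (["fever"], ["发烧"]),
    (["cough", "headache"], ["咳嗽", "头痛"]),
]

alignments = dice_alignment(toy_corpus, threshold=0.9)
for (en, zh), score in sorted(alignments.items()):
    print(en, zh, round(score, 2))
```

On this toy corpus, only the pairs that always co-occur pass the 0.9 threshold; a real system would likely need a larger corpus and a lower, tuned threshold.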
