Low-Bias Extraction of Domain-Specific Concepts

The availability of domain-specific knowledge models in various forms has led to the development of several tools and applications specialized for complex domains such as biomedicine, tourism and chemistry. Yet most current approaches to extracting domain-specific knowledge from text are limited in their portability to other domains and languages. In this paper, we present and evaluate an approach to the low-bias extraction of domain-specific concepts. Our approach is based on graph clustering and makes no use of a priori knowledge about the language or the domain being processed. It can therefore be applied to virtually any language. The evaluation is carried out on two data sets that differ in cleanness and size.
