An Approach to Automatic Construction of a Hierarchical Subject Domain for Question Answering Systems

We propose a new statistical algorithm for the automatic construction of subject domains that can be used in e-mail or Web question answering systems and in ontology generation. The domain hierarchy is extracted from electronic texts written in a natural language such as English. During text processing, the quality and quantity of the information presented in the texts are evaluated, and hierarchical relationships between text fragments are then established from the derived data. Using this approach, we have created a question answering system that navigates the hierarchy based on an analysis of the query, including an evaluation of the user’s familiarity with the subject domain. In combination, these steps yield comprehensive and non-redundant answers.
