Domain-specific FAQ retrieval using independent aspects

This investigation presents an approach to domain-specific FAQ (frequently asked question) retrieval based on independent aspects. In the data analysis, the questions in the collected QA (question-answer) pairs are classified into ten question types according to their question stems. The answers in the QA pairs are then segmented into paragraphs and clustered using latent semantic analysis and the K-means algorithm. For semantic representation of the aspects, a domain-specific ontology is constructed from WordNet and HowNet. A probabilistic mixture model is then used to interpret the query and the QA pairs in terms of these independent aspects, so the retrieval process can be viewed as a maximum likelihood estimation problem. The expectation-maximization (EM) algorithm is employed to estimate the optimal mixing weights of the probabilistic mixture model. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in medical FAQ retrieval.
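As a rough illustration of the last step, the sketch below runs EM to estimate mixing weights when the per-aspect match scores are held fixed. It is not the authors' model: the function names (`em_mixing_weights`, `log_likelihood`), the two-aspect setup, and the toy scores are all assumptions made for this example, and the abstract does not specify how the aspect likelihoods themselves are computed.

```python
import math


def log_likelihood(scores, weights):
    """Log-likelihood of the data under a mixture with fixed component scores."""
    return sum(
        math.log(sum(w * s for w, s in zip(weights, row))) for row in scores
    )


def em_mixing_weights(scores, n_iter=100, tol=1e-9):
    """Estimate mixing weights by EM.

    scores[i][k] is an assumed, strictly positive likelihood of training
    pair i under aspect k. Only the mixing weights are re-estimated.
    """
    n_aspects = len(scores[0])
    weights = [1.0 / n_aspects] * n_aspects  # start from uniform weights
    prev = log_likelihood(scores, weights)
    for _ in range(n_iter):
        totals = [0.0] * n_aspects
        for row in scores:
            # E-step: responsibility of each aspect for this pair.
            joint = [w * s for w, s in zip(weights, row)]
            norm = sum(joint)
            for k, value in enumerate(joint):
                totals[k] += value / norm
        # M-step: new weight is the average responsibility per aspect.
        weights = [t / len(scores) for t in totals]
        cur = log_likelihood(scores, weights)
        if cur - prev < tol:  # stop once the likelihood no longer improves
            break
        prev = cur
    return weights


# Hypothetical scores for four QA pairs under two aspects
# (for instance, a question-type aspect and an answer-cluster aspect).
toy_scores = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.6, 0.3],
]
weights = em_mixing_weights(toy_scores)
print(weights)
```

Because each EM iteration cannot decrease the likelihood, the returned weights score at least as well on the toy data as the uniform starting point. In a retrieval setting, the learned weights would then combine the per-aspect scores of a new query against each candidate QA pair.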
