Combining Deep Linguistics Analysis and Surface Pattern Learning: A Hybrid Approach to Chinese Definitional Question Answering

We explore a hybrid approach to Chinese definitional question answering that combines deep linguistic analysis with surface pattern learning. This study addresses four questions: 1) How helpful are linguistic analysis and pattern learning? 2) What kinds of questions can be answered by pattern matching? 3) How much annotation does a pattern-based system need to achieve good performance? 4) Which linguistic features are most useful? We conduct extensive experiments on biographical questions and other definitional questions. The major findings are: 1) linguistic analysis and pattern learning are complementary, and both are required to build a good definitional QA system; 2) pattern matching is very effective for biographical questions but less effective for other definitional questions; 3) only a small amount of annotation is needed for a pattern learning system to achieve good performance on biographical questions; 4) the most useful linguistic features are copulas and appositives; relations also play an important role, while only some propositions convey vital facts.
