Finding similar questions in collaborative question answering archives: toward bootstrapping-based equivalent pattern learning

Many questions submitted to Collaborative Question Answering (CQA) sites are similar to questions that have already been answered. We propose a precise approach for automatically finding answers to such questions by identifying “equivalent” questions that were submitted and answered in the past. Our method automatically generates equivalent question patterns by grouping together questions that have previously obtained the same answers. The generated patterns serve as seed patterns: a new bootstrapping-based learning method matches them against more questions to extract a large number of additional equivalent patterns. The resulting patterns can be applied to match a new question to an equivalent one that has already been answered, and thus to suggest potential answers automatically. We experimented with this approach on a large collection of more than 200,000 real questions drawn from the Yahoo! Answers archive, automatically acquiring over 16,991 groups of equivalent question patterns. These patterns allow our method to obtain over 57% recall and over 54% precision in suggesting answers automatically to new questions, significantly improving over baseline methods.
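The seed-pattern idea described above can be illustrated with a minimal Python sketch. It is not the method evaluated in the paper, and it makes several simplifying assumptions: answers are compared by exact text match, the only slot is a single-token placeholder `<X>`, and the stopword list and all function names (`group_by_answer`, `seed_patterns`, `equivalent_forms`) are hypothetical. The bootstrapping step that expands the seed patterns is not shown.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def group_by_answer(qa_pairs):
    """Group questions that received the same answer (assumed: exact text match)."""
    groups = defaultdict(list)
    for question, answer in qa_pairs:
        groups[answer.strip().lower()].append(question)
    # Only groups with at least two questions give evidence of equivalence.
    return [qs for qs in groups.values() if len(qs) >= 2]

def seed_patterns(group, stopwords):
    """Turn one same-answer group into patterns by slotting the shared content terms."""
    token_sets = [set(tokenize(q)) - stopwords for q in group]
    shared = set.intersection(*token_sets)
    if not shared:
        return set()
    return {
        tuple("<X>" if tok in shared else tok for tok in tokenize(q))
        for q in group
    }

def match(pattern, question):
    """Return the term bound to <X> if the question fits the pattern, else None."""
    tokens = tokenize(question)
    if len(tokens) != len(pattern):
        return None
    binding = None
    for p, t in zip(pattern, tokens):
        if p == "<X>":
            if binding is not None and binding != t:
                return None
            binding = t
        elif p != t:
            return None
    return binding

def equivalent_forms(question, pattern_group):
    """Rewrite a new question into every equivalent form in the pattern group."""
    for pattern in pattern_group:
        term = match(pattern, question)
        if term is not None:
            return {
                " ".join(term if p == "<X>" else p for p in other)
                for other in pattern_group
            }
    return set()

# Toy archive (illustrative data, not from the Yahoo! Answers collection).
archive = [
    ("Who wrote Hamlet?", "William Shakespeare"),
    ("Who is the author of Hamlet?", "William Shakespeare"),
    ("What is the capital of France?", "Paris"),
]
STOPWORDS = {"who", "is", "the", "of", "what"}

groups = group_by_answer(archive)
patterns = seed_patterns(groups[0], STOPWORDS)
rewrites = equivalent_forms("Who wrote Macbeth?", patterns)
```

In this toy run, the two Hamlet questions share an answer and so yield the patterns `who wrote <X>` and `who is the author of <X>`. A new question such as "Who wrote Macbeth?" can then be rewritten into both forms and looked up in the archive of answered questions.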
