Learning to Rank Answers on Large Online QA Collections

This work describes an answer ranking engine for non-factoid questions, built using a large online community-generated question-answer collection (Yahoo! Answers). We show how such collections can be used to set up large supervised learning experiments effectively. Furthermore, we investigate a wide range of feature types, some of which exploit NLP processors, and demonstrate that combining them leads to considerable improvements in accuracy.
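
To make the idea of combining feature types in a supervised ranker concrete, here is a minimal sketch. It is not the system described above. It assumes that each question comes with several candidate answers, one of which is marked as the best (for example, by the asker), and that every candidate is already represented by a small numeric feature vector. The two-feature layout, the function names, and the use of a pairwise perceptron update are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: one feature vector per candidate answer,
# e.g. [similarity_score, translation_score]. These names are assumptions.
FeatureVector = Sequence[float]
Question = Tuple[List[FeatureVector], int]  # (candidates, index of best answer)


def score(weights: Sequence[float], features: FeatureVector) -> float:
    """Linear score: a weighted combination of all feature values."""
    return sum(w * f for w, f in zip(weights, features))


def train_pairwise_perceptron(
    questions: List[Question], n_features: int, epochs: int = 10
) -> List[float]:
    """Learn weights so the best answer outscores every other candidate."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for candidates, best in questions:
            gold = candidates[best]
            for i, cand in enumerate(candidates):
                if i == best:
                    continue
                # Mistake-driven update: move the weights toward the gold
                # answer and away from a candidate that ties or beats it.
                if score(weights, gold) <= score(weights, cand):
                    weights = [w + g - c for w, g, c in zip(weights, gold, cand)]
    return weights


def rank(weights: Sequence[float], candidates: List[FeatureVector]) -> List[int]:
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(
        range(len(candidates)),
        key=lambda i: score(weights, candidates[i]),
        reverse=True,
    )
```

In this sketch, adding another feature type only means appending values to each candidate's vector; the learner then decides how much weight each feature deserves.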
