Discriminative Information Retrieval for Question Answering Sentence Selection

We propose a framework for discriminative IR atop linguistic features, trained to improve the recall of answer candidate passage retrieval, the initial step in text-based question answering. We formalize this as an instance of linear feature-based IR, demonstrating a 34%-43% improvement in recall for candidate triage for QA.

[1]  Jennifer Chu-Carroll,et al.  Building Watson: An Overview of the DeepQA Project , 2010, AI Mag..

[2]  Peter Clark,et al.  Automatic Coupling of Answer Extraction and Information Retrieval , 2013, ACL.

[3]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[4]  Ellen M. Voorhees,et al.  Retrieval evaluation with incomplete information , 2004, SIGIR '04.

[5]  Di Wang,et al.  A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering , 2015, ACL.

[6]  Lei Yu,et al.  Deep Learning for Answer Sentence Selection , 2014, ArXiv.

[7]  Bowen Zhou,et al.  ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs , 2015, TACL.

[8]  James P. Callan,et al.  Structured retrieval for question answering , 2007, SIGIR.

[9]  Fredric C. Gey,et al.  Inferring probability of relevance using the method of logistic regression , 1994, SIGIR '94.

[10]  Noah A. Smith,et al.  Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions , 2010, NAACL.

[11]  Ramesh Nallapati,et al.  Discriminative models for information retrieval , 2004, SIGIR '04.

[12]  Christopher D. Manning,et al.  Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering , 2010, COLING.

[13]  Ming-Wei Chang,et al.  Question Answering Using Enhanced Lexical Semantic Models , 2013, ACL.

[14]  W. Bruce Croft,et al.  Linear feature-based models for information retrieval , 2007, Information Retrieval.

[15]  Larry P. Heck,et al.  Learning deep structured semantic models for web search using clickthrough data , 2013, CIKM.

[16]  Yi Yang,et al.  WikiQA: A Challenge Dataset for Open-Domain Question Answering , 2015, EMNLP.

[17]  Chris Callison-Burch,et al.  Answer Extraction as Sequence Tagging with Tree Edit Distance , 2013, NAACL.

[18]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[19]  Fredric C. Gey,et al.  Probabilistic retrieval based on staged logistic regression , 1992, SIGIR '92.

[20]  Alessandro Moschitti,et al.  Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks , 2015, SIGIR.

[21]  Stephen E. Robertson,et al.  Relevance weighting of search terms , 1976, J. Am. Soc. Inf. Sci..