Answer Extraction by Recursive Parse Tree Descent

We develop a recursive neural network (RNN) to extract answers to arbitrary natural language questions from supporting sentences, training on a crowdsourced data set (to be released upon presentation). Applied recursively, starting from the token vectors of a neural probabilistic language model, the RNN builds feature representations at every node of the parse trees of questions and supporting sentences. In contrast to prior work, we fix neither the types of the questions nor the forms of the answers; the system classifies tokens to match a substring chosen by the question’s author. At each parse tree node of a support sentence, our classifier decides whether or not to descend, based on the node’s RNN embedding together with those of its siblings and of the question’s root node, until it reaches the tokens it selects as the answer. A novel co-training task on subtree recognition boosts the RNN’s performance, as does a scheme for consistently handling words that are poorly represented in the language model. On our data set, we outperform an open source system epitomizing a classic “pattern bootstrapping” approach to question answering.
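The descent procedure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Node` structure, the `follow` classifier, and the embeddings are hypothetical stand-ins, and the RNN composition that would actually produce the embeddings is omitted.

```python
# Sketch of answer extraction by recursive parse tree descent.
# Assumptions (not from the paper): Node, follow, and the embedding
# vectors are illustrative placeholders.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Node:
    embedding: List[float]                  # RNN embedding of this parse tree node
    token: Optional[str] = None             # set only at leaves (tokens)
    children: List["Node"] = field(default_factory=list)

def extract_answer(
    node: Node,
    question_root: List[float],
    follow: Callable[[Node, List[Node], List[float]], bool],
) -> List[str]:
    """Descend the support sentence's parse tree, collecting the tokens
    under every node the classifier decides to follow."""
    if node.token is not None:              # reached a token: select it
        return [node.token]
    answer: List[str] = []
    for child in node.children:
        siblings = [c for c in node.children if c is not child]
        # Classify the child's embedding together with its siblings'
        # embeddings and the question's root embedding.
        if follow(child, siblings, question_root):
            answer.extend(extract_answer(child, question_root, follow))
    return answer
```

A trained classifier would replace `follow`; any boolean function over (node, siblings, question root) slots in, so the traversal logic is independent of the particular model.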
