A Neural Network for Factoid Question Answering over Paragraphs

Text classification methods for tasks like factoid question answering typically use manually defined string matching rules or bag of words representations. These methods are ineective when question text contains very few individual words (e.g., named entities) that are indicative of the answer. We introduce a recursive neural network (rnn) model that can reason over such input by modeling textual compositionality. We apply our model, qanta, to a dataset of questions from a trivia competition called quiz bowl. Unlike previous rnn models, qanta learns word and phrase-level representations that combine across sentences to reason about entities. The model outperforms multiple baselines and, when combined with information retrieval methods, rivals the best human players.

[1]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[2]  Mengqiu Wang,et al.  A Survey of Answer Extraction Techniques in Factoid Question Answering , 2006 .

[3]  Mirella Lapata,et al.  Using Semantic Roles to Improve Question Answering , 2007, EMNLP.

[4]  Percy Liang,et al.  Zero-shot Entity Extraction from Web Pages , 2014, ACL.

[5]  Jaime G. Carbonell,et al.  Rank learning for factoid question answering with linguistic and semantic constraints , 2010, CIKM.

[6]  Jordan L. Boyd-Graber,et al.  Besting the Quiz Master: Crowdsourcing Incremental Classification Games , 2012, EMNLP.

[7]  Patrick Gallinari,et al.  Ranking with ordered weighted pairwise classification , 2009, ICML '09.

[8]  Christopher D. Manning,et al.  Generating Typed Dependency Parses from Phrase Structure Parses , 2006, LREC.

[9]  Phil Blunsom,et al.  Recurrent Convolutional Neural Networks for Discourse Compositionality , 2013, CVSM@ACL.

[10]  Claire Cardie,et al.  Compositional Matrix-Space Models for Sentiment Analysis , 2011, EMNLP.

[11]  Katrin Erk,et al.  Vector Space Models of Word Meaning and Phrase Meaning: A Survey , 2012, Lang. Linguistics Compass.

[12]  Phil Blunsom,et al.  “Not not bad” is not “bad”: A distributional account of negation , 2013, CVSM@ACL.

[13]  Noah A. Smith,et al.  What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA , 2007, EMNLP.

[14]  Philip Resnik,et al.  Political Ideology Detection Using Recursive Neural Networks , 2014, ACL.

[15]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[16]  Andrew Y. Ng,et al.  Parsing with Compositional Vector Grammars , 2013, ACL.

[17]  Ioannis Korkontzelos,et al.  Estimating Linear Models for Compositional Distributional Semantics , 2010, COLING.

[18]  Jason Weston,et al.  WSABIE: Scaling Up to Large Vocabulary Image Annotation , 2011, IJCAI.

[19]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.

[20]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[21]  Danqi Chen,et al.  Reasoning With Neural Tensor Networks for Knowledge Base Completion , 2013, NIPS.

[22]  Quoc V. Le,et al.  Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.

[23]  Mehrnoosh Sadrzadeh,et al.  Multi-Step Regression Learning for Compositional Distributional Semantics , 2013, IWCS.

[24]  Siddharth Patwardhan,et al.  Question analysis: How Watson reads a clue , 2012, IBM J. Res. Dev..

[25]  Christoph Goller,et al.  Learning task-dependent distributed representations by backpropagation through structure , 1996, Proceedings of International Conference on Neural Networks (ICNN'96).

[26]  Phil Blunsom,et al.  The Role of Syntax in Vector Space Models of Compositional Semantics , 2013, ACL.

[27]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[28]  Jeffrey Pennington,et al.  Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.

[29]  Mirella Lapata,et al.  Vector-based Models of Semantic Composition , 2008, ACL.

[30]  Jason Weston,et al.  Semantic Frame Identification with Distributed Word Representations , 2014, ACL.

[31]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[32]  Jordan L. Boyd-Graber,et al.  Grammatical structures for word-level sentiment detection , 2012, NAACL.

[33]  Eugene Agichtein,et al.  Finding the right facts in the crowd: factoid question answering over social media , 2008, WWW.

[34]  Geoffrey E. Hinton,et al.  Zero-shot Learning with Semantic Output Codes , 2009, NIPS.

[35]  Gerhard Weikum,et al.  YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia: Extended Abstract , 2013, IJCAI.