Learning What is Essential in Questions

Question answering (QA) systems are easily distracted by irrelevant or redundant words in questions, especially when faced with long or multi-sentence questions in difficult domains. This paper introduces and studies the notion of essential question terms with the goal of improving such QA solvers. We illustrate the importance of essential question terms by showing that humans’ ability to answer questions drops significantly when essential terms are removed from questions. We then develop a classifier that reliably (90% mean average precision) identifies and ranks essential terms in questions. Finally, we use the classifier to demonstrate that the notion of question term essentiality allows a state-of-the-art QA solver for elementary-level science questions to make better and more informed decisions, improving performance by up to 5%. We also introduce a new dataset of over 2,200 science questions annotated with crowd-sourced essential terms.
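To make the ranking-and-evaluation setup concrete, here is a minimal sketch of a per-term essentiality classifier scored with average precision. The feature set and data below are placeholders chosen for illustration (the paper's actual classifier uses a much richer set of syntactic, semantic, and retrieval-based features); it only shows the general recipe of scoring each question term and ranking terms by predicted essentiality.

```python
# Minimal sketch, assuming a placeholder per-term feature set
# [is_content_word, idf, term_length]; not the paper's actual features.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import average_precision_score

# Hypothetical training data: one feature vector per question term,
# labeled 1 if annotators marked the term essential, else 0.
X_train = np.array([
    [1, 0.9, 6],
    [0, 0.1, 2],
    [1, 0.7, 8],
    [0, 0.2, 3],
])
y_train = np.array([1, 0, 1, 0])

clf = SVC(probability=True).fit(X_train, y_train)

# Rank the terms of a new question by predicted essentiality score.
question_terms = ["which", "gas", "do", "plants", "absorb"]
X_test = np.array([
    [0, 0.1, 5],
    [1, 0.8, 3],
    [0, 0.1, 2],
    [1, 0.9, 6],
    [1, 0.8, 6],
])
scores = clf.predict_proba(X_test)[:, 1]
ranking = sorted(zip(question_terms, scores), key=lambda p: -p[1])
print(ranking)

# Average precision for one question; mean average precision is the
# mean of this quantity over all evaluation questions.
y_true = np.array([0, 1, 0, 1, 1])
print("AP for this question:", average_precision_score(y_true, scores))
```

In this setup, a downstream QA solver can use the per-term scores to down-weight or drop low-essentiality terms when retrieving and scoring candidate answers.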
