FastHybrid: A Hybrid Model for Efficient Answer Selection

Answer selection is a core component in any question-answering systems. It aims to select correct answer sentences for a given question from a pool of candidate sentences. In recent years, many deep learning methods have been proposed and shown excellent results for this task. However, these methods typically require extensive parameter (and hyper-parameter) tuning, which give rise to efficiency issues for large-scale datasets, and potentially make them less portable across new datasets and domains (as re-tuning is usually required). In this paper, we propose an extremely efficient hybrid model (FastHybrid) that tackles the problem from both an accuracy and scalability point of view. FastHybrid is a light-weight model that requires little tuning and adaptation across different domains. It combines a fast deep model (which will be introduced in the method section) with an initial information retrieval model to effectively and efficiently handle answer selection. We introduce a new efficient attention mechanism in the hybrid model and demonstrate its effectiveness on several QA datasets. Experimental results show that although the hybrid uses no training data, its accuracy is often on-par with supervised deep learning techniques, while significantly reducing training and tuning costs across different domains.

[1]  Di Wang,et al.  A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering , 2015, ACL.

[2]  W. Bruce Croft,et al.  Learning concept importance using a weighted dependence model , 2010, WSDM '10.

[3]  Trishul M. Chilimbi,et al.  Project Adam: Building an Efficient and Scalable Deep Learning Training System , 2014, OSDI.

[4]  Yaoliang Yu,et al.  Petuum: A New Platform for Distributed Machine Learning on Big Data , 2013, IEEE Transactions on Big Data.

[5]  Geoffrey E. Hinton,et al.  Zero-shot Learning with Semantic Output Codes , 2009, NIPS.

[6]  Harris Wu,et al.  Evaluating Web-based Question Answering Systems , 2002, LREC.

[7]  Alessandro Moschitti,et al.  Automatic Feature Engineering for Answer Selection and Extraction , 2013, EMNLP.

[8]  Charles L. A. Clarke,et al.  Term proximity scoring for ad-hoc retrieval on very large text collections , 2006, SIGIR.

[9]  Kate Saenko,et al.  Return of Frustratingly Easy Domain Adaptation , 2015, AAAI.

[10]  Yi Yang,et al.  WikiQA: A Challenge Dataset for Open-Domain Question Answering , 2015, EMNLP.

[11]  David A. Ferrucci,et al.  Introduction to "This is Watson" , 2012, IBM J. Res. Dev..

[12]  Tao Tao,et al.  An exploration of proximity measures in information retrieval , 2007, SIGIR.

[13]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[14]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[15]  Sumit Chopra,et al.  DLID: Deep Learning for Domain Adaptation by Interpolating between Domains , 2013 .

[16]  Honglak Lee,et al.  Online Incremental Feature Learning with Denoising Autoencoders , 2012, AISTATS.

[17]  Iadh Ounis,et al.  Query performance prediction , 2006, Inf. Syst..

[18]  Lei Yu,et al.  Deep Learning for Answer Sentence Selection , 2014, ArXiv.

[19]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[20]  Bowen Zhou,et al.  Improved Representation Learning for Question Answer Matching , 2016, ACL.

[21]  Ellen M. Voorhees,et al.  Overview of the TREC-9 Question Answering Track , 2000, TREC.

[22]  Oren Etzioni Search needs a shake-up , 2011, Nature.

[23]  Bowen Zhou,et al.  Attentive Pooling Networks , 2016, ArXiv.

[24]  Bowen Zhou,et al.  Applying deep learning to answer selection: A study and an open task , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).

[25]  Philip H. S. Torr,et al.  An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[26]  Tomas Mikolov,et al.  Bag of Tricks for Efficient Text Classification , 2016, EACL.

[27]  Djoerd Hiemstra,et al.  The Combination and Evaluation of Query Performance Prediction Methods , 2009, ECIR.

[28]  Sanja Fidler,et al.  Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[30]  Feng Xia,et al.  Introduction to , 2015, ACM Trans. Multim. Comput. Commun. Appl..

[31]  Andrew Y. Ng,et al.  Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.

[32]  Ming-Wei Chang,et al.  Question Answering Using Enhanced Lexical Semantic Models , 2013, ACL.

[33]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[34]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.