Neural information retrieval: at the end of the early years

A recent “third wave” of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs typically comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent years have witnessed explosive growth in research on NN-based approaches to information retrieval (IR), and a significant body of work now exists. In this paper, we survey the current landscape of neural IR research, paying special attention to the use of learned distributed representations of textual units. We highlight the successes of neural IR thus far, catalog obstacles to its wider adoption, and suggest potentially promising directions for future research.
