DeepIntent: Learning Attentions for Online Advertising with Recurrent Neural Networks

In this paper, we investigate the use of recurrent neural networks (RNNs) for search-based online advertising. We use RNNs to map both queries and ads to real-valued vectors, with which the relevance of a given (query, ad) pair can be computed directly. On top of the RNN, we propose a novel attention network that learns to assign attention scores to word locations according to their intent importance (hence the name DeepIntent). The vector representation of a sequence is then computed as a sum of the RNN's hidden states at each word, weighted by their attention scores. We train both the RNN and the attention network end to end under the guidance of user click logs sampled from a commercial search engine. We show that in most cases the attention network improves the quality of the learned vector representations, as measured by AUC on a manually labeled dataset. We further demonstrate the usefulness of the learned attention scores in two ways: query rewriting and a modified BM25 metric. Using the learned attention scores, we produce sub-queries of higher quality than those of state-of-the-art methods, and by weighting term frequency with the attention scores in a standard BM25 formula, we improve its AUC as well.
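
To make the two uses of attention above concrete, here is a minimal NumPy sketch. The parameter shapes and names (`w`, `v`), and the exact way the attention weight enters BM25, are illustrative assumptions rather than the paper's precise formulation.

```python
import numpy as np

def attention_pool(hidden_states, w, v):
    """Pool RNN hidden states (T, d) into a single sequence vector.

    w (d, d) and v (d,) are assumed parameters of the attention
    network; the paper's exact parameterization may differ.
    """
    # One scalar score per word position, softmax-normalized so the
    # scores form a distribution over word locations.
    scores = np.tanh(hidden_states @ w) @ v            # (T,)
    scores = np.exp(scores - scores.max())
    attn = scores / scores.sum()                       # attention weights
    # Sequence vector = attention-weighted sum of the hidden states.
    return attn @ hidden_states, attn                  # (d,), (T,)

def attention_bm25(tf, attn, n_docs, df, dl, avgdl, k1=1.2, b=0.75):
    """Standard BM25 term score with the raw term frequency scaled by
    the learned attention weight (one plausible combination; assumed)."""
    tf_a = tf * attn
    idf = np.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    return idf * tf_a * (k1 + 1) / (tf_a + k1 * (1 - b + b * dl / avgdl))

# Relevance of a (query, ad) pair: cosine similarity of pooled vectors.
rng = np.random.default_rng(0)
d = 8
w, v = rng.normal(size=(d, d)), rng.normal(size=d)
q_vec, _ = attention_pool(rng.normal(size=(3, d)), w, v)   # 3-word query
a_vec, _ = attention_pool(rng.normal(size=(6, d)), w, v)   # 6-word ad
relevance = q_vec @ a_vec / (np.linalg.norm(q_vec) * np.linalg.norm(a_vec))
```

In this sketch the hidden states would come from the trained RNN; random vectors stand in for them only so the snippet runs on its own.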
