Integrating Lexical and Temporal Signals in Neural Ranking Models for Searching Social Media Streams

Time is an important relevance signal when searching streams of social media posts. The distribution of document timestamps from the results of an initial query can be leveraged to infer the distribution of relevant documents, which can then be used to rerank the initial results. Previous experiments have shown that kernel density estimation is a simple yet effective implementation of this idea. This paper explores an alternative approach to mining temporal signals with recurrent neural networks. Our intuition is that neural networks provide a more expressive framework for capturing the temporal coherence of documents that are close to one another in time. To our knowledge, we are the first to integrate lexical and temporal signals in an end-to-end neural network architecture, in which existing neural ranking models are used to generate query-document similarity vectors that feed into a bidirectional LSTM layer for temporal modeling. Our results are mixed: existing neural models for document ranking alone yield limited improvements over simple baselines, but the integration of lexical and temporal signals yields significant improvements over competitive temporal baselines.
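To make the temporal feedback idea concrete, the following is a minimal sketch of the kernel density estimation baseline mentioned above, not the neural architecture this paper proposes. It assumes a Gaussian kernel, uniform weights over the initial results, and a log-linear combination of lexical score and temporal density. The function names `temporal_density` and `rerank`, the parameters `bandwidth` and `alpha`, and the toy data are all hypothetical choices made for illustration.

```python
import math


def temporal_density(timestamps, bandwidth):
    """Return a Gaussian KDE over the timestamps of the initial results.

    Assumption: every retrieved document contributes equally (uniform weights).
    """
    norm = len(timestamps) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(t):
        total = sum(
            math.exp(-0.5 * ((t - ts) / bandwidth) ** 2) for ts in timestamps
        )
        return total / norm

    return density


def rerank(results, bandwidth=1.0, alpha=0.5):
    """Rerank (doc_id, lexical_score, timestamp) tuples.

    Assumption: the final score is lexical_score + alpha * log(density),
    an illustrative log-linear combination.
    """
    density = temporal_density([ts for _, _, ts in results], bandwidth)
    rescored = [
        (doc_id, score + alpha * math.log(density(ts)), ts)
        for doc_id, score, ts in results
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Toy example: two posts cluster in time, one slightly higher-scoring post is isolated.
initial = [("a", 1.0, 10.0), ("b", 1.0, 10.5), ("c", 1.05, 50.0)]
reranked = rerank(initial, bandwidth=1.0, alpha=0.5)
```

In this toy example the isolated post `c` has the highest lexical score, but it falls in a sparse region of the timestamp distribution and is demoted below the two temporally clustered posts.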
