A content search method for security topics in microblog based on deep reinforcement learning

Traditional methods treat search as a process of sequentially selecting and ranking documents. These methods have proven effective and are widely used in web search. However, because microblog text is complex and idiosyncratic, the classical methods are rarely used in microblog searches for specific topics. To address topic-specific search over microblog content, we present a microblog search method for security topics based on deep reinforcement learning, modeling the search as a continuous-state Markov decision process. We also design a novel deep Q network to evaluate the relevance of microblog content to the target topic. Reinforcement learning supplies the search strategy, while deep learning evaluates content relevance. Experiments conducted on a real-world dataset show that our approach outperforms the selected baseline methods.
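To make the Markov decision process framing concrete, the following is a minimal sketch and not the network described in the paper. It assumes a toy setting in which each candidate post is summarized by a continuous state vector `[1.0, sim]`, where `sim` is a hypothetical topic-similarity score in [0, 1], the agent chooses between skipping and selecting the post, and the reward is +1 for selecting a relevant post, -1 for selecting an irrelevant one, and 0 for skipping. A linear Q function stands in for the deep Q network, the data are synthetic, and all names and parameters are illustrative assumptions.

```python
import random

ACTIONS = (0, 1)  # 0 = skip the post, 1 = select it for the result list


def q_value(weights, state, action):
    """Linear Q(s, a): dot product of the action's weights with the state."""
    return sum(w * x for w, x in zip(weights[action], state))


def td_update(weights, state, action, reward, next_state, alpha=0.05, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    if next_state is None:  # last post of the episode, no future value
        target = reward
    else:
        target = reward + gamma * max(q_value(weights, next_state, a) for a in ACTIONS)
    error = target - q_value(weights, state, action)
    for i, x in enumerate(state):
        weights[action][i] += alpha * error * x
    return error


def make_episode(rng, length=8):
    """Synthetic stream of posts (assumption): a post is relevant iff sim > 0.5."""
    episode = []
    for _ in range(length):
        sim = rng.random()
        episode.append(([1.0, sim], sim > 0.5))
    return episode


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    weights = {a: [0.0, 0.0] for a in ACTIONS}
    for _ in range(episodes):
        episode = make_episode(rng)
        for t, (state, relevant) in enumerate(episode):
            # Uniformly random behavior policy; Q-learning is off-policy,
            # so it can still estimate the greedy policy's values.
            action = rng.choice(ACTIONS)
            if action == 1:
                reward = 1.0 if relevant else -1.0
            else:
                reward = 0.0
            next_state = episode[t + 1][0] if t + 1 < len(episode) else None
            td_update(weights, state, action, reward, next_state)
    return weights


def policy(weights, state):
    """Greedy policy: pick the action with the larger estimated Q value."""
    return max(ACTIONS, key=lambda a: q_value(weights, state, a))


if __name__ == "__main__":
    w = train()
    for sim in (0.05, 0.5, 0.95):
        print(sim, "select" if policy(w, [1.0, sim]) == 1 else "skip")
```

In this sketch the state is continuous because `sim` is real-valued, which is why a function approximator replaces a Q table. A deep Q network would replace the linear `q_value` with a learned nonlinear model of the post and topic text.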
