Security topics related microblogs search based on deep convolutional neural networks

Abstract Searching social network content, and microblogs in particular, is an active research topic in information retrieval. Because microblog text is arbitrarily typed and semantically ambiguous, classical retrieval approaches cannot be applied directly. In this paper, we propose a security-topic-related microblog search model based on deep convolutional neural networks (DCNN-CSTRS) that retrieves microblogs similar to the content of a specific security topic. The model is trained to capture local semantic features of short microblog texts in order to filter security-topic-related content from microblogs. A matching model based on a deep convolutional neural network extracts local features from queries and documents separately, through the non-linear feature transformations of convolution and pooling, and then ranks the query-document pairs by similarity. Experimental results demonstrate that the proposed approach outperforms state-of-the-art methods.
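The matching idea described in the abstract (convolution over word windows, pooling, then ranking query-document pairs by similarity) can be illustrated with a minimal sketch. This is not the DCNN-CSTRS implementation: the embeddings and filters are random rather than trained, a single convolution layer with tanh and max-over-time pooling stands in for the deep network, cosine similarity is an assumed scoring function, and all names (`embed`, `encode`, `rank_documents`) and sizes are hypothetical.

```python
import math
import random

EMB_DIM, WINDOW, N_FILTERS = 8, 2, 6  # illustrative sizes, not from the paper
_rng = random.Random(0)

# Hypothetical, randomly initialised parameters; a trained model would learn these.
_embeddings = {}
_filters = [[_rng.gauss(0, 1) for _ in range(WINDOW * EMB_DIM)]
            for _ in range(N_FILTERS)]


def embed(token):
    """Return a fixed random embedding for a token, created on first use."""
    if token not in _embeddings:
        _embeddings[token] = [_rng.gauss(0, 1) for _ in range(EMB_DIM)]
    return _embeddings[token]


def encode(text):
    """Map a short text to a feature vector via convolution and max pooling."""
    vectors = [embed(t) for t in text.lower().split()]
    # Zero-pad so that very short texts still yield at least one window.
    while len(vectors) < WINDOW:
        vectors.append([0.0] * EMB_DIM)
    pooled = [-1.0] * N_FILTERS  # tanh is bounded below by -1
    for start in range(len(vectors) - WINDOW + 1):
        # Concatenate the embeddings in the current window of words.
        window = [x for vec in vectors[start:start + WINDOW] for x in vec]
        for k, filt in enumerate(_filters):
            activation = math.tanh(sum(w * x for w, x in zip(filt, window)))
            pooled[k] = max(pooled[k], activation)  # max-over-time pooling
    return pooled


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(query, documents):
    """Return (score, document) pairs sorted by descending similarity."""
    q = encode(query)
    scored = [(cosine(q, encode(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = ["explosion reported downtown", "nice weather today"]
    for score, doc in rank_documents("explosion reported downtown", docs):
        print(f"{score:.3f}  {doc}")
```

Because the query and each document pass through the same encoder, identical texts score a similarity of 1. With random parameters, the ordering of non-identical texts carries no meaning; only a trained model would produce semantically useful rankings.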
