A semantic modeling method for social network short text based on spatial and temporal characteristics

Abstract The inherent sparsity of social network short text makes semantic inference infeasible for conventional topic models. By exploiting the spatial and temporal characteristics of social network data, we propose a semantic modeling method for social network short text, named the Spatial and Temporal Topic Model (STTM). To mitigate short text sparsity, STTM models co-occurring word–word pairs, and it incorporates time information into the topic modeling process to generate higher-quality topics. Experimental results on four real social media datasets verify the effectiveness of STTM.
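The co-occurrence idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the STTM implementation: it assumes that a word–word pair is any unordered pair of distinct words appearing in the same short text (as in biterm-style models), and that time information is available as a discrete time-slice index per post. The function names `extract_word_pairs` and `count_pairs_by_time_slice` and the sample posts are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_word_pairs(tokens):
    """Return the unordered pairs of distinct words in one short text.

    Assumption: a pair is formed by any two distinct words in the same
    text; the paper's exact pairing rule is not given in the abstract.
    """
    unique = sorted(set(tokens))  # sort so each pair has one canonical order
    return list(combinations(unique, 2))


def count_pairs_by_time_slice(posts):
    """Count word pairs per time slice.

    posts: iterable of (time_slice, tokens) tuples (hypothetical format).
    Returns a dict mapping time_slice -> Counter of word pairs.
    """
    counts = {}
    for time_slice, tokens in posts:
        counts.setdefault(time_slice, Counter()).update(extract_word_pairs(tokens))
    return counts


# Made-up example posts, used only to exercise the functions.
posts = [
    (0, ["storm", "flood", "city"]),
    (0, ["flood", "city", "rescue"]),
    (1, ["election", "vote"]),
]
pair_counts = count_pairs_by_time_slice(posts)
```

Pooling pairs across the corpus gives a topic model many more co-occurrence observations than any single short text provides, which is the usual motivation for pair-based models; keeping the counts per time slice is one simple way to retain the temporal signal.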
