A Novel Embedding Method for Information Diffusion Prediction in Social Network Big Data

As social networking websites multiply and users interact more frequently, information diffusion prediction in the era of social big data must support both effective generalization and efficient inference. However, existing models either rely on expensive probabilistic modeling of information diffusion over partially known network structures, or discover the implicit structure of diffusion from users’ behaviors without considering the impact of the diffused content. To address these issues, we propose a novel information-dependent embedding-based diffusion prediction (IEDP) model that maps the users in an observed diffusion process into a latent embedding space, so that the temporal order of users in a cascade, given by their timestamps, is preserved by the embedding distances between users. The model further learns the propagation probability of a piece of information in a cascade as a function of the relative positions of information-specific user embeddings in an information-dependent subspace. The problem of temporal propagation prediction is thereby converted into a task of spatial probability learning in the embedding space. Moreover, we present an efficient margin-based optimization algorithm that enables fast inference of information diffusion in the latent embedding space. Experiments on several social network datasets show the effectiveness of the proposed approach for information diffusion prediction and its efficiency in inference speed compared with state-of-the-art methods.
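
To make the idea of preserving temporal order by embedding distance concrete, the following is a minimal sketch and not the IEDP algorithm itself, whose exact objective is not given in this abstract. It assumes a generic pairwise hinge (margin) loss in which users infected earlier should lie closer to the cascade source than users infected later, and it assumes that the information-dependent subspace can be modeled as a hyperplane with an information-specific normal vector. The names `project`, `margin_ranking_loss`, and `predict_order` are hypothetical.

```python
import numpy as np


def project(z, w):
    """Project user embeddings z (n_users x dim) onto the hyperplane whose
    normal is w. Assumption: this stands in for the information-dependent
    subspace; the abstract does not specify its form."""
    w = w / np.linalg.norm(w)
    return z - np.outer(z @ w, w)


def margin_ranking_loss(z, w, cascade, margin=1.0):
    """Pairwise hinge loss for one cascade (assumed objective).

    cascade: user indices sorted by infection time; cascade[0] is the source.
    A pair (earlier, later) is penalized unless the later user is at least
    `margin` farther from the source than the earlier user."""
    p = project(z, w)
    src = p[cascade[0]]
    d = np.sum((p[cascade[1:]] - src) ** 2, axis=1)  # squared distances to source
    loss = 0.0
    for i in range(len(d)):
        for j in range(i + 1, len(d)):
            loss += max(0.0, margin + d[i] - d[j])
    return loss


def predict_order(z, w, source):
    """Rank all other users by distance to the source in the projected space;
    nearer users are predicted to be reached earlier."""
    p = project(z, w)
    d = np.sum((p - p[source]) ** 2, axis=1)
    return [int(u) for u in np.argsort(d) if u != source]
```

Under these assumptions, prediction reduces to a distance computation and a sort, which illustrates why inference in an embedding space can be cheaper than simulating a probabilistic cascade model.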
