Tracking Idea Flows between Social Groups

In many applications, ideas that are described by sets of words flow between different groups. To help users analyze these flows, we present a method that models flow behavior by identifying the lead-lag relationships between word clusters of different user groups. In particular, an improved Bayesian conditional cointegration method based on dynamic time warping is employed to learn links between words in different groups. A tensor-based technique is then developed to cluster these linked words into clusters (ideas) and to track the flow of ideas. The main feature of the tensor representation is that it introduces two additional dimensions, one representing time and one representing lead-lag relationships. Experiments on both synthetic and real datasets show that our method achieves better accuracy than methods based on traditional clustering techniques. A case study demonstrates the usefulness of our method in helping users understand the flow of ideas between different user groups on social media.
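
To make the lead-lag notion concrete, the following is a minimal sketch of how plain dynamic time warping can suggest which of two word-frequency series leads the other. It is not the improved Bayesian conditional cointegration described above; it only illustrates the alignment step that such a method builds on. The function names `dtw_path` and `mean_lag`, the absolute-difference cost, and the toy series are assumptions made for illustration.

```python
from typing import List, Tuple


def dtw_path(a: List[float], b: List[float]) -> List[Tuple[int, int]]:
    """Return one optimal DTW alignment path between series a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)  # ties resolve toward the smaller indices
    path.reverse()
    return path


def mean_lag(a: List[float], b: List[float]) -> float:
    """Average of (j - i) over the path; positive suggests b lags a."""
    path = dtw_path(a, b)
    return sum(j - i for i, j in path) / len(path)


# Hypothetical daily counts of one word in two user groups.
group_a = [0, 0, 1, 3, 1, 0, 0, 0, 0]
group_b = [0, 0, 0, 0, 1, 3, 1, 0, 0]  # same burst, two steps later
print(mean_lag(group_a, group_b) > 0)  # group_a appears to lead
```

A positive mean lag only hints at a lead-lag link between the two series. The method summarized above additionally tests such links statistically and arranges them in a tensor with time and lead-lag dimensions before clustering.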
