Discovering patterns in time-varying graphs: a triclustering approach

This paper introduces a novel technique for tracking structures in time-varying graphs. The method uses a maximum a posteriori approach to fit a three-dimensional co-clustering of the source vertices, the destination vertices, and time to the data under study, without requiring any hyper-parameter tuning. The three dimensions are segmented simultaneously to build clusters of source vertices, clusters of destination vertices, and time segments in which the edge distributions across clusters of vertices follow the same evolution over the time segments. The main novelty of the approach is that the time segments are inferred directly from the evolution of the edge distribution between the vertices, so the user does not need to choose any a priori quantization of time. Experiments on artificial data illustrate the good behavior of the technique, and a study of a real-life data set shows the potential of the approach for exploratory data analysis.
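
To make the three-dimensional grid concrete, the following minimal sketch counts timestamped edges per cell (source cluster, destination cluster, time segment) for a given partition. It illustrates only the data structure that a triclustering summarizes; it is not the paper's maximum a posteriori criterion or its optimization procedure. The function name `tricluster_counts`, the cluster maps, and the cut point `time_cuts` are hypothetical and chosen here for illustration.

```python
from bisect import bisect_right
from collections import Counter


def tricluster_counts(edges, src_cluster, dst_cluster, time_cuts):
    """Count edges per (source cluster, destination cluster, time segment).

    edges       -- iterable of (source, destination, timestamp) triples
    src_cluster -- dict mapping each source vertex to a cluster index
    dst_cluster -- dict mapping each destination vertex to a cluster index
    time_cuts   -- sorted list of cut points splitting time into segments
    (All names are illustrative assumptions, not taken from the paper.)
    """
    counts = Counter()
    for src, dst, t in edges:
        # Segment index = number of cut points less than or equal to t.
        seg = bisect_right(time_cuts, t)
        counts[(src_cluster[src], dst_cluster[dst], seg)] += 1
    return counts


# Toy data: five timestamped edges, two source clusters, one destination
# cluster, and a single cut point at t = 2.0 (two time segments).
edges = [
    ("a", "x", 0.5),
    ("a", "y", 1.0),
    ("b", "x", 2.5),
    ("b", "y", 3.0),
    ("a", "x", 3.5),
]
src_cluster = {"a": 0, "b": 1}
dst_cluster = {"x": 0, "y": 0}
time_cuts = [2.0]

counts = tricluster_counts(edges, src_cluster, dst_cluster, time_cuts)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

In this toy example the partition is fixed by hand. In the approach described above, the vertex clusters and the time cut points are instead inferred jointly from the data, so no quantization of time has to be supplied in advance.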
