Regular Decomposition of Multivariate Time Series and Other Matrices

We describe and illustrate a novel algorithm for clustering a large number of time series into a few 'regular groups'. Our method is inspired by Szemerédi's Regularity Lemma (SRL) from graph theory. SRL suggests that large graphs and matrices can be naturally 'compressed' by partitioning their elements into a small number of sets. These sets, together with the patterns of relations between them, capture the large-scale structure of the object, while the finer details are random-like. We develop a maximum likelihood method for finding such 'regular structures' and apply it to smart meter data from households. The resulting structure appears more informative than one found by k-means. The algorithm scales well with data size, and the structure itself becomes more apparent as the data grows. Our method could therefore be useful in the broader context of emerging big data.
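To make the idea of a maximum likelihood partition concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a 0/1 matrix whose entries are independent Bernoulli variables with a probability that depends only on the row's group and the column, and it alternates between assigning each row to its most likely group and re-estimating the group parameters. The function name `fit_row_groups`, the farthest-point initialisation, and the clipping constant `eps` are all illustrative choices, not taken from the source.

```python
import numpy as np

def fit_row_groups(X, k, n_iter=50, eps=0.05):
    """Hard-assignment maximum likelihood grouping of the rows of a 0/1 matrix.

    Assumption (hypothetical model, not from the paper): entry X[i, j] is
    Bernoulli with probability P[g, j], where g is the group of row i.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Farthest-point initialisation: start from row 0, then repeatedly pick
    # the row with the largest Hamming distance to the rows chosen so far.
    centers = [0]
    dist = np.abs(X - X[0]).sum(axis=1)
    while len(centers) < k:
        nxt = int(dist.argmax())
        centers.append(nxt)
        dist = np.minimum(dist, np.abs(X - X[nxt]).sum(axis=1))

    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    P = np.clip(X[centers], eps, 1 - eps)
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Log-likelihood of every row under every group's parameters.
        loglik = X @ np.log(P).T + (1 - X) @ np.log(1 - P).T
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition no longer changes
        labels = new_labels
        # Re-estimate each non-empty group's column probabilities.
        for g in range(k):
            members = X[labels == g]
            if len(members):
                P[g] = np.clip(members.mean(axis=0), eps, 1 - eps)
    return labels, P

# Toy data: two row patterns, each with one slightly perturbed row.
X = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
])
labels, P = fit_row_groups(X, k=2)
print(labels)
print(P.round(2))
```

In this sketch the matrix `P` plays the role of the compressed description: a few rows of group-level probabilities replace the full data matrix, and the deviations of individual rows from their group are treated as noise. Real-valued series such as meter readings would need a different likelihood, which is outside the scope of this illustration.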