Rank-one partitioning: formalization, illustrative examples, and a new cluster enhancing strategy

In this paper, we introduce and formalize a rank-one partitioning learning paradigm that unifies partitioning methods that summarize a data set with a single vector, from which the final clustering partition is then derived. Using this unification as a starting point, we propose a novel algorithmic solution to the partitioning problem based on rank-one matrix factorization and denoising of piecewise constant signals. Finally, we provide an empirical evaluation of our findings and demonstrate the robustness of the proposed denoising step. We believe that our work offers a new point of view on several unsupervised learning techniques and helps to gain a deeper understanding of the general mechanisms of data partitioning.
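To make the general idea concrete, the following is a minimal sketch of a rank-one partitioning pipeline. It is not the algorithm proposed in the paper. It assumes that the single summary vector is the leading left singular vector of the centered data, and it replaces the denoising step with a much simpler stand-in: the sorted vector is cut at its largest jumps, so that each segment plays the role of one constant piece. The function name `rank_one_partition` and its parameters are hypothetical.

```python
import numpy as np

def rank_one_partition(X, n_clusters):
    """Illustrative sketch only: partition the rows of X from one summary vector."""
    X = np.asarray(X, dtype=float)

    # Assumption: the rank-one summary is the leading left singular vector
    # of the column-centered data matrix.
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    u = U[:, 0]

    # Sorting the entries turns the summary vector into a noisy staircase.
    order = np.argsort(u)
    sorted_u = u[order]

    # Stand-in for piecewise constant denoising: cut the staircase at the
    # (n_clusters - 1) largest jumps between consecutive sorted entries.
    if n_clusters > 1:
        gaps = np.diff(sorted_u)
        cuts = np.sort(np.argsort(gaps)[-(n_clusters - 1):]) + 1
    else:
        cuts = np.array([], dtype=int)

    # Every entry at or after a cut position moves to the next segment.
    labels_sorted = np.zeros(len(u), dtype=int)
    for c in cuts:
        labels_sorted[c:] += 1

    # Map the segment labels back to the original row order.
    labels = np.empty(len(u), dtype=int)
    labels[order] = labels_sorted
    return labels, u

# Usage on two synthetic, well-separated groups of points.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                    rng.normal(5.0, 0.1, (20, 2))])
labels_demo, u_demo = rank_one_partition(X_demo, n_clusters=2)
```

Because the sign of a singular vector is arbitrary, which group receives label 0 is not fixed. Cutting at the largest jumps is sensitive to outliers, which is one reason a dedicated denoising method for piecewise constant signals may be preferable in practice.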
