Jointly clustering rows and columns of binary matrices: algorithms and trade-offs

In standard clustering problems, data points are represented by vectors, and by stacking them together, one forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we study three algorithms with different running time and compare the number of observations needed by them for successful cluster recovery. Our analytical results show smooth time-data trade offs: one can gradually reduce the computational complexity when increasingly more observations are available.

[1]  L. Wasserman,et al.  Statistical and computational tradeoffs in biclustering , 2011 .

[2]  Emmanuel J. Candès,et al.  A Singular Value Thresholding Algorithm for Matrix Completion , 2008, SIAM J. Optim..

[3]  Arlindo L. Oliveira,et al.  Biclustering algorithms for biological data analysis: a survey , 2004, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[4]  Yudong Chen,et al.  Clustering Partially Observed Graphs via Convex Optimization , 2011, ICML.

[5]  Ewout van den Berg,et al.  1-Bit Matrix Completion , 2012, ArXiv.

[6]  Alexandre Proutière,et al.  Community Detection via Random and Adaptive Sampling , 2014, COLT.

[7]  George M. Church,et al.  Biclustering of Expression Data , 2000, ISMB.

[8]  Yudong Chen,et al.  Incoherence-Optimal Matrix Completion , 2013, IEEE Transactions on Information Theory.

[9]  Emmanuel J. Candès,et al.  Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..

[10]  Bin Yu,et al.  Spectral clustering and the high-dimensional stochastic blockmodel , 2010, 1007.1684.

[11]  Greg Linden,et al.  Amazon . com Recommendations Item-to-Item Collaborative Filtering , 2001 .

[12]  E. Arias-Castro,et al.  Community Detection in Random Networks , 2013, 1302.7099.

[13]  Yudong Chen,et al.  Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices , 2014, J. Mach. Learn. Res..

[14]  Yi Ma,et al.  Robust principal component analysis? , 2009, JACM.

[15]  Michael I. Jordan,et al.  Computational and statistical tradeoffs via convex relaxation , 2012, Proceedings of the National Academy of Sciences.

[16]  Sivaraman Balakrishnan,et al.  Minimax Localization of Structural Information in Large Noisy Matrices , 2011, NIPS.

[17]  Jean Bourgain,et al.  On the singularity probability of discrete random matrices , 2009, 0905.0461.

[18]  Onkar Dabeer,et al.  Analysis of a Collaborative Filter Based on Popularity Amongst Neighbors , 2012, IEEE Transactions on Information Theory.

[19]  Ali Jalali,et al.  Low-Rank Matrix Recovery From Errors and Erasures , 2013, IEEE Transactions on Information Theory.

[20]  Yehuda Koren,et al.  Matrix Factorization Techniques for Recommender Systems , 2009, Computer.

[21]  Aidong Zhang,et al.  Cluster analysis for gene expression data: a survey , 2004, IEEE Transactions on Knowledge and Data Engineering.

[22]  Van H. Vu,et al.  Spectral norm of random matrices , 2005, STOC '05.

[23]  Pablo A. Parrilo,et al.  Rank-Sparsity Incoherence for Matrix Decomposition , 2009, SIAM J. Optim..

[24]  Sujay Sanghavi,et al.  Clustering Sparse Graphs , 2012, NIPS.

[25]  Santosh S. Vempala,et al.  Spectral Algorithms , 2009, Found. Trends Theor. Comput. Sci..

[26]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.

[27]  Bikash Kumar Dey,et al.  A Channel Coding Perspective of Collaborative Filtering , 2011, IEEE Transactions on Information Theory.

[28]  Frank McSherry,et al.  Spectral partitioning of random graphs , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[29]  Philippe Rigollet,et al.  Complexity Theoretic Lower Bounds for Sparse Principal Component Detection , 2013, COLT.

[30]  Panos M. Pardalos,et al.  Biclustering in data mining , 2008, Comput. Oper. Res..

[31]  J. Hartigan Direct Clustering of a Data Matrix , 1972 .

[32]  P. Rigollet,et al.  Optimal detection of sparse principal components in high dimension , 2012, 1202.5070.

[33]  B. Nadler,et al.  Do Semidefinite Relaxations Really Solve Sparse PCA , 2013 .

[34]  Yihong Wu,et al.  Computational Barriers in Minimax Submatrix Detection , 2013, ArXiv.

[35]  Jieping Ye,et al.  An accelerated gradient method for trace norm minimization , 2009, ICML '09.

[36]  Joel A. Tropp,et al.  User-Friendly Tail Bounds for Sums of Random Matrices , 2010, Found. Comput. Math..

[37]  Emmanuel J. Candès,et al.  The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.

[38]  Laurent Massoulié,et al.  Distributed user profiling via spectral methods , 2010, SIGMETRICS '10.

[39]  Prateek Jain,et al.  Low-rank matrix completion using alternating minimization , 2012, STOC '13.

[40]  S. Yun,et al.  An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems , 2009 .

[41]  Benjamin Recht,et al.  A Simpler Approach to Matrix Completion , 2009, J. Mach. Learn. Res..