Multiple Incomplete Views Clustering via Weighted Nonnegative Matrix Factorization with L2, 1 Regularization

With the advance of technology, data are often with multiple modalities or coming from multiple sources. Multi-view clustering provides a natural way for generating clusters from such data. Although multi-view clustering has been successfully applied in many applications, most of the previous methods assumed the completeness of each view (i.e., each instance appears in all views). However, in real-world applications, it is often the case that a number of views are available for learning but none of them is complete. The incompleteness of all the views and the number of available views make it difficult to integrate all the incomplete views and get a better clustering solution. In this paper, we propose MIC (Multi-Incomplete-view Clustering), an algorithm based on weighted nonnegative matrix factorization with L2,1 regularization. The proposed MIC works by learning the latent feature matrices for all the views and generating a consensus matrix so that the difference between each view and the consensus is minimized. MIC has several advantages comparing with other existing methods. First, MIC incorporates weighted nonnegative matrix factorization, which handles the missing instances in each incomplete view. Second, MIC uses a co-regularized approach, which pushes the learned latent feature matrices of all the views towards a common consensus. By regularizing the disagreement between the latent feature matrices and the consensus, MIC can be easily extended to more than two incomplete views. Third, MIC incorporates L2,1 regularization into the weighted nonnegative matrix factorization, which makes it robust to noises and outliers. Forth, an iterative optimization framework is used in MIC, which is scalable and proved to converge. Experiments on real datasets demonstrate the advantages of MIC.

[1]  Derek Greene,et al.  A Matrix Factorization Approach for Integrating Multiple Data Views , 2009, ECML/PKDD.

[2]  Massimiliano Pontil,et al.  Multi-Task Feature Learning , 2006, NIPS.

[3]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[4]  Hans-Peter Kriegel,et al.  MUSE: Multi-Represented Similarity Estimation , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[5]  Chris H. Q. Ding,et al.  Robust tensor factorization using R1 norm , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Michael W. Berry,et al.  Document clustering using nonnegative matrix factorization , 2006, Inf. Process. Manag..

[7]  Xindong Wu,et al.  Feature Selection by Joint Graph Sparse Coding , 2013, SDM.

[8]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[9]  Andrew Zisserman,et al.  A Visual Vocabulary for Flower Classification , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Philip S. Yu,et al.  A General Model for Multiple View Unsupervised Learning , 2008, SDM.

[11]  Haesun Park,et al.  Sparse Nonnegative Matrix Factorization for Clustering , 2008 .

[12]  Sham M. Kakade,et al.  Multi-view clustering via canonical correlation analysis , 2009, ICML '09.

[13]  Chris H. Q. Ding,et al.  R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization , 2006, ICML.

[14]  Wei Tang,et al.  Clustering with Multiple Graphs , 2009, 2009 Ninth IEEE International Conference on Data Mining.

[15]  Stéphane Marchand-Maillet,et al.  Multiview clustering: a late fusion approach using latent models , 2009, SIGIR.

[16]  Seungjin Choi,et al.  Weighted nonnegative matrix factorization , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[17]  Christopher J. C. Burges,et al.  Spectral clustering and transductive learning with multiple views , 2007, ICML '07.

[18]  Jiawei Han,et al.  Multi-View Clustering via Joint Nonnegative Matrix Factorization , 2013, SDM.

[19]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[20]  Chris H. Q. Ding,et al.  Weighted Feature Subset Non-negative Matrix Factorization and Its Applications to Document Understanding , 2010, ICDM.

[21]  Xin Liu,et al.  Document clustering based on non-negative matrix factorization , 2003, SIGIR.

[22]  Xuan Li,et al.  Robust Nonnegative Matrix Factorization via Half-Quadratic Minimization , 2012, 2012 IEEE 12th International Conference on Data Mining.

[23]  Yuhong Guo,et al.  Convex Subspace Representation Learning from Multi-View Data , 2013, AAAI.

[24]  Rayid Ghani,et al.  Analyzing the effectiveness and applicability of co-training , 2000, CIKM '00.

[25]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[26]  Steffen Bickel,et al.  Multi-view clustering , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[27]  Philip S. Yu,et al.  Clustering on Multiple Incomplete Datasets via Collective Kernel Learning , 2013, 2013 IEEE 13th International Conference on Data Mining.

[28]  Cédric Févotte,et al.  Majorization-minimization algorithm for smooth Itakura-Saito nonnegative matrix factorization , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[29]  Wei Cheng,et al.  Flexible and robust co-regularized multi-domain graph clustering , 2013, KDD.

[30]  Shao-Yuan Li,et al.  Partial Multi-View Clustering , 2014, AAAI.

[31]  Chris H. Q. Ding,et al.  Collaborative Filtering: Weighted Nonnegative Matrix Factorization Incorporating User and Item Graphs , 2010, SDM.

[32]  Hal Daumé,et al.  A Co-training Approach for Multi-view Spectral Clustering , 2011, ICML.

[33]  Chris H. Q. Ding,et al.  Robust nonnegative matrix factorization using L21-norm , 2011, CIKM '11.

[34]  Hal Daumé,et al.  Co-regularized Multi-view Spectral Clustering , 2011, NIPS.

[35]  Martha White,et al.  Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions , 2011, AAAI.