A Novel Technique of Using Coupled Matrix and Greedy Coordinate Descent for Multi-view Data Representation

The challenge of clustering multi-view data is to learn all latent features embedded in multiple views accurately and efficiently. Existing multi-view methods based on non-negative matrix factorization (NMF) learn the latent features embedded in each view independently before building the consensus matrix. As a result, they are computationally expensive and suffer from poor accuracy. We propose to formulate and solve multi-view data representation using Coupled Matrix Factorization (CMF), in which the latent structure of the data is learned directly from multiple views. The similarity information of data samples, computed from all views, is incorporated into the CMF process, leading to a unified framework that exploits all available information and returns an accurate and meaningful clustering solution. To improve computational efficiency, we present a variable-selection-based Greedy Coordinate Descent algorithm to solve the formulated CMF. Experiments on several datasets against several state-of-the-art benchmark methods show the effectiveness of the proposed model.
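The general idea of coupling views through a shared factor and updating it by greedy coordinate descent can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the objective sum over views of 0.5 * ||X_v - W H_v||_F^2 with a shared non-negative sample factor W and one non-negative H_v per view, it omits the similarity (regularization) term, and it selects one coordinate per row per inner pass by largest objective decrease. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def objective(Xs, W, Hs):
    # Sum of per-view reconstruction errors with the shared factor W.
    return sum(0.5 * np.linalg.norm(X - W @ H) ** 2 for X, H in zip(Xs, Hs))

def greedy_cd_rows(W, A, B, inner_iters):
    """Greedy coordinate descent for min 0.5*||X - W H||_F^2 subject to W >= 0,
    given A = H H^T (k x k) and B = X H^T (n x k). Each pass updates, in every
    row of W, the single coordinate whose exact update lowers the objective most."""
    G = W @ A - B                              # gradient with respect to W
    diag = np.maximum(np.diag(A), 1e-12)       # second derivatives, guarded
    rows = np.arange(W.shape[0])
    for _ in range(inner_iters):
        S = np.maximum(W - G / diag, 0.0) - W  # exact non-negative step per coordinate
        D = -G * S - 0.5 * diag * S ** 2       # objective decrease of each step
        r = np.argmax(D, axis=1)               # variable selection: best coordinate per row
        s = S[rows, r]
        W[rows, r] += s
        G += s[:, None] * A[r, :]              # keep the gradient consistent with W
    return W

def coupled_nmf(Xs, k, outer_iters=30, inner_iters=5, seed=0):
    """Assumed coupled model: every view X_v (n x d_v) shares the factor W (n x k)."""
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[0]
    W = rng.random((n, k))
    Hs = [rng.random((k, X.shape[1])) for X in Xs]
    X_all = np.hstack(Xs)
    history = [objective(Xs, W, Hs)]
    for _ in range(outer_iters):
        # Shared factor: all views contribute through the stacked H.
        H_all = np.hstack(Hs)
        W = greedy_cd_rows(W, H_all @ H_all.T, X_all @ H_all.T, inner_iters)
        # View-specific factors: the same routine applied to the transposed problem.
        for v, X in enumerate(Xs):
            Hs[v] = greedy_cd_rows(Hs[v].T.copy(), W.T @ W, X.T @ W, inner_iters).T
        history.append(objective(Xs, W, Hs))
    return W, Hs, history

# Synthetic two-view data for illustration only.
rng = np.random.default_rng(1)
W_true = rng.random((40, 3))
Xs = [W_true @ rng.random((3, 15)), W_true @ rng.random((3, 25))]
W, Hs, history = coupled_nmf(Xs, k=3)
labels = W.argmax(axis=1)  # one simple way to read cluster assignments off W
```

Because each coordinate step is an exact minimization under the non-negativity constraint, the objective in `history` should not increase between outer iterations. Reading labels from the largest entry of each row of W is only one convention; running k-means on W is a common alternative.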
