V3H: Incomplete Multi-view Clustering via View Variation and View Heredity

Real data often appear in the form of multiple incomplete views, and incomplete multi-view clustering is an effective way to integrate these views. Previous methods learn only the consistent information shared between views and ignore the unique information of each view, which limits their clustering performance and generalization. To overcome this limitation, we propose a novel View Variation and View Heredity approach (V3H). Inspired by variation and heredity in genetics, V3H first decomposes each subspace into a variation matrix for the corresponding view and a heredity matrix shared by all views, which represent the unique information and the consistent information, respectively. Then, by aligning different views based on their cluster indicator matrices, V3H integrates the unique information from different views to improve clustering performance. Finally, with the help of an adjustable low-rank representation based on the heredity matrix, V3H recovers the underlying true data structure to reduce the influence of large incompleteness. More importantly, V3H is possibly the first work to introduce ideas from genetics into clustering algorithms for simultaneously learning the consistent information and the unique information from incomplete multi-view data. Extensive experimental results on fifteen benchmark datasets validate its superiority over other state-of-the-art methods.
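The split into one shared matrix and one residual per view can be illustrated with a toy sketch. This is not the V3H optimization, which the abstract does not specify. It assumes every view is fully observed and already has a subspace matrix of the same shape, and it uses the across-view mean as a simple stand-in for the heredity matrix. The function name `split_heredity_variation` and all data below are hypothetical.

```python
import numpy as np


def split_heredity_variation(subspaces):
    """Toy split of per-view matrices into a shared part and per-view residuals.

    Assumption: the shared ("heredity") part is taken to be the across-view
    mean, and each "variation" part is what remains. This only illustrates
    the idea of Z_v = H + V_v; it is not the method from the paper.
    """
    stacked = np.stack(subspaces)       # shape: (n_views, n, n)
    heredity = stacked.mean(axis=0)     # one matrix shared by all views
    variations = stacked - heredity     # one residual matrix per view
    return heredity, variations


# Synthetic example: three views that share a common structure plus noise.
rng = np.random.default_rng(0)
shared = rng.random((4, 4))
views = [shared + 0.1 * rng.standard_normal((4, 4)) for _ in range(3)]

heredity, variations = split_heredity_variation(views)
```

By construction, adding `heredity` back to `variations[v]` reproduces view `v` exactly, and the variation matrices sum to zero across views. A real method would instead learn both parts jointly under constraints such as the low-rank and alignment terms the abstract mentions.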
