iCmSC: Incomplete Cross-Modal Subspace Clustering

Cross-modal clustering aims to group highly similar cross-modal data together while separating dissimilar data. Although promising cross-modal methods have been developed in recent years, existing state-of-the-art methods cannot effectively capture the correlations between modalities when the cross-modal data are incomplete, which can severely degrade clustering performance. To address this scenario, we propose a novel incomplete cross-modal clustering method that integrates canonical correlation analysis and exclusive representation, named incomplete Cross-modal Subspace Clustering (iCmSC). To learn a consistent subspace representation from incomplete cross-modal data, we maximize the intrinsic correlations among different modalities with deep canonical correlation analysis (DCCA), and place an exclusive self-expression layer after the output layers of DCCA. We apply an $\ell_{1,2}$-norm regularization in the learned subspace to make the learned representation more discriminative: samples from different clusters become mutually exclusive, while samples within the same cluster attract each other. Meanwhile, decoding networks reconstruct the feature representation and thereby preserve the structural information of the original cross-modal data. Finally, we demonstrate the effectiveness of the proposed iCmSC through extensive experiments, which show that iCmSC achieves consistently large improvements over state-of-the-art methods.
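To make the self-expression and $\ell_{1,2}$-norm terms concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the common "exclusive lasso" convention in which the squared $\ell_{1,2}$ norm is the sum over rows of the squared $\ell_1$ norm, and a self-expression residual of the form $\|Z - CZ\|_F^2$ with a zero diagonal on $C$. The function names, the weight `lam`, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def l12_norm_squared(C):
    """Squared l_{1,2} norm under the assumed convention:
    sum over rows of (l1 norm of the row) squared."""
    row_l1 = np.abs(C).sum(axis=1)
    return float(np.sum(row_l1 ** 2))

def self_expression_loss(Z, C, lam=1.0):
    """Assumed objective: ||Z - C Z||_F^2 + lam * ||C||_{1,2}^2.
    Rows of Z are samples in the learned subspace; C is n x n."""
    # Zero the diagonal so no sample is represented by itself.
    C = C - np.diag(np.diag(C))
    residual = Z - C @ Z
    return float(np.sum(residual ** 2) + lam * l12_norm_squared(C))

# Toy data: two clusters of two identical samples each.
Z = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

# Each sample is expressed only by the other member of its cluster.
C_block = np.array([[0.0, 1.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 0.0]])

loss = self_expression_loss(Z, C_block, lam=0.5)
```

In this toy case the block-diagonal coefficient matrix reconstructs `Z` exactly, so the loss consists only of the regularization term. In the method described above, `Z` would come from the DCCA output layers and `C` would be learned jointly with the networks.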
