Binary Multi-View Clustering

Clustering is a long-standing important research problem, however, remains challenging when handling large-scale image data from diverse sources. In this paper, we present a novel Binary Multi-View Clustering (BMVC) framework, which can dexterously manipulate multi-view image data and easily scale to large data. To achieve this goal, we formulate BMVC by two key components: compact collaborative discrete representation learning and binary clustering structure learning, in a joint learning framework. Specifically, BMVC collaboratively encodes the multi-view image descriptors into a compact common binary code space by considering their complementary information; the collaborative binary representations are meanwhile clustered by a binary matrix factorization model, such that the cluster structures are optimized in the Hamming space by pure, extremely fast bit-operations. For efficiency, the code balance constraints are imposed on both binary data representations and cluster centroids. Finally, the resulting optimization problem is solved by an alternating optimization scheme with guaranteed fast convergence. Extensive experiments on four large-scale multi-view image datasets demonstrate that the proposed method enjoys the significant reduction in both computation and memory footprint, while observing superior (in most cases) or very competitive performance, in comparison with state-of-the-art clustering methods.

[1]  Zi Huang,et al.  A Unified Framework for Discrete Spectral Clustering , 2016, IJCAI.

[2]  Yongdong Zhang,et al.  Multiview Spectral Embedding , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[3]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[5]  Jiawei Han,et al.  Multi-View Clustering via Joint Nonnegative Matrix Factorization , 2013, SDM.

[6]  J. A. Hartigan,et al.  A k-means clustering algorithm , 1979 .

[7]  Edward Y. Chang,et al.  Parallel Spectral Clustering in Distributed Systems , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Lin Wu,et al.  Robust Hashing for Multi-View Data: Jointly Learning Low-Rank Kernelized Similarity Consensus and Hash Functions , 2016, Image Vis. Comput..

[9]  Fillia Makedon,et al.  Fast Nonnegative Matrix Tri-Factorization for Large-Scale Data Co-Clustering , 2011, IJCAI.

[10]  Wei Liu,et al.  Supervised Discrete Hashing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Feiping Nie,et al.  Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Multi-View K-Means Clustering on Big Data , 2022 .

[12]  Pavel Berkhin,et al.  A Survey of Clustering Data Mining Techniques , 2006, Grouping Multidimensional Data.

[13]  Philip S. Yu,et al.  Online multi-view clustering with incomplete views , 2016, 2016 IEEE International Conference on Big Data (Big Data).

[14]  Fei Yang,et al.  Web scale photo hash clustering on a single machine , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Xuelong Li,et al.  Multi-View Clustering and Semi-Supervised Classification with Adaptive Neighbours , 2017, AAAI.

[16]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[17]  Rui Yang,et al.  Discrete Multi-view Hashing for Effective Image Retrieval , 2017, ICMR.

[18]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[19]  Yang Yang,et al.  A Fast Optimization Method for General Binary Code Learning , 2016, IEEE Transactions on Image Processing.

[20]  Xuelong Li,et al.  Parameter-Free Auto-Weighted Multiple Graph Learning: A Framework for Multiview Clustering and Semi-Supervised Classification , 2016, IJCAI.

[21]  W. Rudin Principles of mathematical analysis , 1964 .

[22]  Ling Shao,et al.  Projection Bank: From High-Dimensional Data to Medium-Length Binary Codes , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[23]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[24]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[25]  Heng Tao Shen,et al.  Unsupervised Deep Hashing with Similarity-Adaptive and Discrete Optimization , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Sergei Vassilvitskii,et al.  k-means++: the advantages of careful seeding , 2007, SODA '07.

[27]  Ling Shao,et al.  Dynamic Multi-View Hashing for Online Image Retrieval , 2017, IJCAI.

[28]  Sham M. Kakade,et al.  Multi-view clustering via canonical correlation analysis , 2009, ICML '09.

[29]  Greg Hamerly,et al.  Accelerating Lloyd’s Algorithm for k -Means Clustering , 2015 .

[30]  Ling Shao,et al.  Learning Short Binary Codes for Large-scale Image Retrieval , 2017, IEEE Transactions on Image Processing.

[31]  Feiping Nie,et al.  Large-Scale Multi-View Spectral Clustering via Bipartite Graph , 2015, AAAI.

[32]  Yuxiao Hu,et al.  Face recognition using Laplacianfaces , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Ling Shao,et al.  Highly-Economized Multi-View Binary Compression for Scalable Image Clustering , 2018, ECCV.

[34]  François Fleuret,et al.  Fast k-means with accurate bounds , 2016, ICML.

[35]  Xinlei Chen,et al.  Large Scale Spectral Clustering with Landmark-Based Representation , 2011, AAAI.

[36]  Zi Huang,et al.  Discrete Nonnegative Spectral Clustering , 2017, IEEE Transactions on Knowledge and Data Engineering.

[37]  Dacheng Tao,et al.  A Survey on Multi-view Learning , 2013, ArXiv.

[38]  Ling Shao,et al.  Latent Structure Preserving Hashing , 2016, International Journal of Computer Vision.

[39]  Sanjiv Kumar,et al.  Learning Binary Codes for High-Dimensional Data Using Bilinear Projections , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[40]  C. Ding,et al.  On the Equivalence of Nonnegative Matrix Factorization and K-means - Spectral Clustering , 2005 .

[41]  Heng Tao Shen,et al.  Classification by Retrieval : Binarizing Data and Classifier , 2017 .

[42]  Wei Liu,et al.  Classification by Retrieval: Binarizing Data and Classifiers , 2017, SIGIR.

[43]  Andreas Krause,et al.  Approximate K-Means++ in Sublinear Time , 2016, AAAI.

[44]  Yue Zhao,et al.  Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup , 2015, ICML.

[45]  Anil K. Jain Data clustering: 50 years beyond K-means , 2008, Pattern Recognit. Lett..

[46]  Hal Daumé,et al.  Co-regularized Multi-view Spectral Clustering , 2011, NIPS.

[47]  Ling Shao,et al.  Multiview Alignment Hashing for Efficient Image Search , 2015, IEEE Transactions on Image Processing.

[48]  Weiwei Liu,et al.  Compressed K-Means for Large-Scale Clustering , 2017, AAAI.

[49]  Wei Liu,et al.  Asymmetric Binary Coding for Image Search , 2017, IEEE Transactions on Multimedia.