A Co-training Approach for Multi-view Spectral Clustering

We propose a spectral clustering algorithm for the multi-view setting where we have access to multiple views of the data, each of which can be independently used for clustering. Our spectral clustering algorithm has a flavor of co-training, which is already a widely used idea in semi-supervised learning. We work on the assumption that the true underlying clustering would assign a point to the same cluster irrespective of the view. Hence, we constrain our approach to only search for the clusterings that agree across the views. Our algorithm does not have any hyperparameters to set, which is a major advantage in unsupervised learning. We empirically compare with a number of baseline methods on synthetic and real-world datasets to show the efficacy of the proposed algorithm.

[1]  Charles R. Johnson,et al.  Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.

[2]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[4]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[5]  Rayid Ghani,et al.  Analyzing the effectiveness and applicability of co-training , 2000, CIKM '00.

[6]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[7]  Joydeep Ghosh,et al.  Cluster Ensembles --- A Knowledge Reuse Framework for Combining Multiple Partitions , 2002, J. Mach. Learn. Res..

[8]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[9]  Steffen Bickel,et al.  Multi-view clustering , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[10]  Derek Greene,et al.  Producing Accurate Interpretable Clusters from High-Dimensional Data , 2005, PKDD.

[11]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[12]  Christopher J. C. Burges,et al.  Spectral clustering and transductive learning with multiple views , 2007, ICML '07.

[13]  Per-Gunnar Martinsson,et al.  Randomized algorithms for the low-rank approximation of matrices , 2007, Proceedings of the National Academy of Sciences.

[14]  V. D. Sa Spectral Clustering with Two Views , 2007 .

[15]  Ulrike von Luxburg,et al.  A tutorial on spectral clustering , 2007, Stat. Comput..

[16]  Mikhail Belkin,et al.  Consistency of spectral clustering , 2008, 0804.0678.

[17]  Christoph H. Lampert,et al.  Correlational spectral clustering , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Mehryar Mohri,et al.  Learning Non-Linear Combinations of Kernels , 2009, NIPS.

[19]  Wei Tang,et al.  Clustering with Multiple Graphs , 2009, 2009 Ninth IEEE International Conference on Data Mining.

[20]  Derek Greene,et al.  A Matrix Factorization Approach for Integrating Multiple Data Views , 2009, ECML/PKDD.

[21]  Massih-Reza Amini,et al.  Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization , 2009, NIPS.

[22]  Sham M. Kakade,et al.  Multi-view clustering via canonical correlation analysis , 2009, ICML '09.