Spectral Embedded Clustering: A Framework for In-Sample and Out-of-Sample Spectral Clustering

Spectral clustering (SC) methods have been successfully applied to many real-world applications. Their success rests largely on the manifold assumption, namely, that two nearby data points in the high-density region of a low-dimensional data manifold have the same cluster label. However, this assumption might not always hold for high-dimensional data. When the data do not exhibit a clear low-dimensional manifold structure (e.g., high-dimensional and sparse data), the clustering performance of SC degrades and becomes even worse than that of K-means clustering. In this paper, motivated by the observation that the true cluster assignment matrix for high-dimensional data can always be embedded in a linear space spanned by the data, we propose the spectral embedded clustering (SEC) framework, in which a linearity regularization is explicitly added to the objective function of SC methods. More importantly, the proposed SEC framework can naturally deal with out-of-sample data. We also present a new Laplacian matrix constructed from a local regression of each pattern and incorporate it into our SEC framework to capture both local and global discriminative information for clustering. Comprehensive experiments on eight real-world high-dimensional datasets demonstrate the effectiveness and advantages of our SEC framework over existing SC methods and K-means-based clustering methods. Our SEC framework significantly outperforms SC using the Nyström algorithm on unseen data.
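The idea of adding a linearity regularization to an SC objective can be illustrated with a minimal NumPy sketch. The abstract does not state the exact objective, so everything specific here is an assumption: the penalty is taken to be the residual of a ridge-regularized linear fit of the relaxed assignment matrix F on the centered data, the graph is a Gaussian-affinity normalized Laplacian rather than the local-regression Laplacian proposed in the paper, and the names `sec_fit`, `sec_predict`, `mu`, `gamma`, and `sigma` are hypothetical. Out-of-sample points are embedded with the learned linear map and assigned to the nearest K-means centroid.

```python
import numpy as np

def _assign(Z, centers):
    # Index of the nearest centroid for every row of Z.
    return np.argmin(((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1), axis=1)

def _kmeans(Z, k, n_iter=50):
    # Plain Lloyd iterations with deterministic farthest-point initialisation.
    centers = [Z[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d2)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = _assign(Z, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return _assign(Z, centers), centers

def sec_fit(X, n_clusters, mu=1.0, gamma=1.0, sigma=1.0):
    """Sketch of SC with a linearity penalty (assumed form). X is (n_samples, n_features)."""
    n, d = X.shape
    # Gaussian affinity and symmetric normalized Laplacian (a standard SC choice, assumed here).
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    A = np.exp(-D2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    dinv = 1.0 / np.sqrt(np.maximum(A.sum(axis=1), 1e-12))
    L = np.eye(n) - dinv[:, None] * A * dinv[None, :]
    # Linearity penalty: for fixed F, min over (W, b) of
    # ||Xc W + 1 b^T - F||^2 + gamma ||W||^2 equals tr(F^T Lg F).
    mean = X.mean(axis=0)
    Xc = X - mean
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    P = np.linalg.solve(Xc.T @ Xc + gamma * np.eye(d), Xc.T)  # shape (d, n)
    Lg = H - Xc @ P
    # Relaxed solution: eigenvectors for the smallest eigenvalues of L + mu * Lg.
    M = L + mu * Lg
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    F = vecs[:, :n_clusters]
    W = P @ F              # linear map that approximates F from the centered data
    b = F.mean(axis=0)     # offset of the linear map
    labels, centers = _kmeans(F, n_clusters)
    return {"labels": labels, "W": W, "b": b, "mean": mean, "centers": centers}

def sec_predict(model, X_new):
    # Embed unseen points with the learned linear map, then take the nearest centroid.
    Z = (X_new - model["mean"]) @ model["W"] + model["b"]
    return _assign(Z, model["centers"])
```

In this sketch, `mu` trades off the graph term against the linearity term and `gamma` controls the ridge penalty on the linear map; both would need tuning on real data, and `sigma` should be set to a scale comparable to typical within-cluster distances.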
