Nonlinear subspace clustering using curvature constrained distances

We proposed a new method to cluster multiple manifolds that may intersect. We defined a new notion of distance between points based on the shortest constrained path. We applied our method to simulated and real datasets and achieved good results.

The massive amount of high-dimensional data in science and engineering demands new approaches to data analysis. Subspace techniques have shown remarkable success in numerous problems in computer vision and data mining, where the goal is to recover the low-dimensional structure of data in an ambient space. Traditional subspace methods such as PCA and ICA assume that the data come from a single manifold. However, the data might come from several (possibly intersecting) manifolds (surfaces). This has motivated the development of new nonlinear techniques for clustering subspaces of high-dimensional data. In this paper, we propose a new algorithm for subspace clustering where the data consist of several possibly intersecting manifolds. To this end, we first propose a curvature constraint for finding the shortest path between data points and then use it in Isomap for subspace learning. The algorithm chooses several landmark nodes at random and then checks whether there is a curvature-constrained path between each landmark node and every other node in the neighborhood graph. It builds a binary feature vector for each point, where each entry represents the connectivity of that point to a particular landmark. The binary feature vectors can then be used as input to conventional clustering algorithms such as hierarchical clustering. Experiments on both synthetic and real data sets support the effectiveness of our algorithm.
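The landmark step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the curvature constraint, so the sketch assumes a simple stand-in in which a path may continue only if the turning angle between consecutive segments stays below a threshold. All function names and parameters (`knn_graph`, `turn_ok`, `reachable_from`, `landmark_features`, `max_angle`) are hypothetical, and the search runs over walks in the neighborhood graph rather than strictly simple paths.

```python
import math
import random
from collections import deque


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph, returned as a list of neighbor sets."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def turn_ok(points, a, b, c, max_angle):
    """True if the turn between segments a->b and b->c is at most max_angle (radians).

    Assumption: this turning-angle test stands in for the curvature constraint.
    """
    u = [q - p for p, q in zip(points[a], points[b])]
    v = [q - p for p, q in zip(points[b], points[c])]
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return False
    cos_turn = sum(x * y for x, y in zip(u, v)) / (nu * nv)
    return cos_turn >= math.cos(max_angle)


def reachable_from(points, adj, source, max_angle):
    """Nodes reachable from source by a walk whose every turn passes turn_ok.

    Breadth-first search over directed edges (prev, cur), because whether a
    step is allowed depends on the direction of arrival.
    """
    reached = {source}
    seen = {(source, nb) for nb in adj[source]}
    queue = deque(seen)
    while queue:
        prev, cur = queue.popleft()
        reached.add(cur)
        for nxt in adj[cur]:
            edge = (cur, nxt)
            if edge not in seen and turn_ok(points, prev, cur, nxt, max_angle):
                seen.add(edge)
                queue.append(edge)
    return reached


def landmark_features(points, k, n_landmarks, max_angle, seed=0):
    """Binary feature vector per point: entry c is 1 if landmark c reaches the point."""
    rng = random.Random(seed)
    adj = knn_graph(points, k)
    landmarks = rng.sample(range(len(points)), n_landmarks)
    reach = [reachable_from(points, adj, lm, max_angle) for lm in landmarks]
    features = [[int(i in r) for r in reach] for i in range(len(points))]
    return features, landmarks
```

The rows of `features` could then be passed to a conventional clustering routine, for example hierarchical clustering with Hamming distance between rows. Points on the same smooth branch tend to share a connectivity pattern, while points separated by a sharp turn (as at an intersection) tend to differ in at least some entries.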
