Efficient Manifold and Subspace Approximations with Spherelets

Data lying in a high-dimensional ambient space are commonly thought to have a much lower intrinsic dimension. In particular, the data may be concentrated near a lower-dimensional subspace or manifold. There is an immense literature focused on approximating the unknown subspace and on exploiting such approximations in clustering, data compression, and building of predictive models. Most of this literature relies on approximating subspaces with a locally linear, and potentially multiscale, dictionary. In this article, we propose a simple and general alternative that instead uses pieces of spheres, or spherelets, to locally approximate the unknown subspace. Building on this idea, we develop a simple and computationally efficient algorithm for subspace learning and clustering. Compared with state-of-the-art competitors, the results show dramatic gains in the ability to accurately approximate the subspace with orders of magnitude fewer components. This leads to substantial gains in data compressibility, fewer clusters and hence better interpretability, and much lower MSE for small to moderate sample sizes. Basic theory on approximation accuracy is presented, and the methods are applied to multiple examples.
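To make the local-sphere idea concrete, below is a minimal Python sketch of one plausible way to fit a single spherelet to a local neighborhood of points: local PCA selects a (d+1)-dimensional affine subspace, and a d-dimensional sphere is then fit inside that subspace by an algebraic least-squares method in the spirit of Coope's linear circle fit. The function names `fit_spherelet` and `project_to_spherelet` are illustrative assumptions, not the paper's implementation; neighborhood selection, noise handling, and the clustering step are omitted.

```python
import numpy as np

def fit_spherelet(X, d):
    """Sketch: fit a d-dimensional spherelet to local points X (n x D), n > d + 1.

    1. Local PCA: take the top (d+1) principal directions of the point cloud.
    2. Fit a sphere in that subspace by algebraic least squares (Coope-style).
    Returns the orthonormal basis V (D x (d+1)), the sphere center (ambient
    coordinates), and the radius.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Top (d+1) right singular vectors span the local affine subspace.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[: d + 1].T                 # D x (d+1) orthonormal basis
    Y = Xc @ V                        # subspace coordinates of the points

    # Algebraic sphere fit: ||y||^2 = 2 y.c0 + b with b = r^2 - ||c0||^2,
    # solved as a linear least-squares problem in (c0, b).
    A = np.hstack([2 * Y, np.ones((Y.shape[0], 1))])
    rhs = np.sum(Y ** 2, axis=1)
    sol, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    c0, b = sol[:-1], sol[-1]
    r = np.sqrt(b + c0 @ c0)          # recover the radius from the offset term
    center = mean + V @ c0            # sphere center in ambient coordinates
    return V, center, r

def project_to_spherelet(x, V, center, r):
    """Project an ambient point x onto the fitted spherelet."""
    y = V @ (V.T @ (x - center))      # component of x - center in the subspace
    norm = np.linalg.norm(y)
    if norm == 0.0:                   # degenerate case: pick an arbitrary direction
        return center + r * V[:, 0]
    return center + (r / norm) * y
```

In a full subspace-learning pipeline, one would partition the data into local neighborhoods (e.g., by clustering or a tree decomposition), fit one spherelet per neighborhood with a routine like the above, and reconstruct each point by projecting it onto the spherelet of its neighborhood.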
