Dimension estimation of image manifolds by minimal cover approximation

Estimating the intrinsic dimension of data is an important problem in feature extraction and feature selection, since it provides an estimate of the number of features to retain. Principal Component Analysis (PCA) is a powerful tool for discovering the dimension of data sets with linear structure, but it becomes ineffective when the data have a nonlinear structure. In this paper, we propose a new PCA-based method to estimate the embedding dimension of data with nonlinear structure. Our method first finds a minimal cover of the data set, then performs PCA locally on each subset in the cover to obtain local intrinsic dimension estimates, and finally reports the average of the local estimates. The method contains two main innovations: (1) a novel noise-filtering procedure applied within the local PCA estimation, and (2) a minimal cover constructed over the whole data set. Because of these two innovations, our method is fast, robust to noise and outliers, converges to a stable estimate over a wide range of sub-region sizes, and can be used incrementally; here a sub-region refers to a local patch approximating the underlying manifold. Experiments on synthetic and image data sets show the effectiveness of the proposed method.
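The sketch below illustrates the general idea of the pipeline described above: cover the data with local neighborhoods, estimate the dimension of each by PCA with a simple variance threshold as a noise filter, and average the local estimates. It is a minimal illustration only; the greedy epsilon-ball cover, the `noise_thresh` cutoff, and the function names are assumptions for exposition and are not the paper's exact cover construction or filtering rule.

```python
import numpy as np

def local_pca_dim(points, noise_thresh=0.95):
    """Local PCA estimate: smallest k whose leading eigenvalues capture
    `noise_thresh` of the variance (a simple stand-in for noise filtering)."""
    centered = points - points.mean(axis=0)
    # Squared singular values = eigenvalues of the (unscaled) local covariance
    eigvals = np.linalg.svd(centered, compute_uv=False) ** 2
    ratios = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(ratios, noise_thresh) + 1)

def greedy_cover(X, radius):
    """Greedy epsilon-ball cover: repeatedly pick an uncovered point as a
    center until every point lies in some ball of the given radius."""
    uncovered = np.ones(len(X), dtype=bool)
    subsets = []
    while uncovered.any():
        center = X[np.flatnonzero(uncovered)[0]]
        members = np.linalg.norm(X - center, axis=1) <= radius
        subsets.append(X[members])
        uncovered &= ~members
    return subsets

def estimate_dimension(X, radius, noise_thresh=0.95):
    """Average the local PCA estimates over all cover subsets
    (subsets with too few points are skipped)."""
    dims = [local_pca_dim(S, noise_thresh)
            for S in greedy_cover(X, radius) if len(S) > 2]
    return float(np.mean(dims))

# Toy example: a noisy circle embedded in R^3 has intrinsic dimension 1,
# so the averaged estimate should be close to 1.
t = np.random.rand(2000) * 2 * np.pi
X = np.c_[np.cos(t), np.sin(t), np.zeros_like(t)] + 0.01 * np.random.randn(2000, 3)
print(estimate_dimension(X, radius=0.3))
```

In this toy setting the sub-region size corresponds to the ball radius; the abstract's claim is that the estimate stays stable over a wide range of such sizes.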
