Manifold blurring mean shift algorithms for manifold denoising

We propose a new family of algorithms for denoising data assumed to lie on a low-dimensional manifold. The algorithms are based on the blurring mean-shift update, which moves each data point towards its neighbors, but they constrain the motion to be orthogonal to the manifold. The resulting algorithms are nonparametric, simple to implement, and very effective at removing noise while preserving the curvature of the manifold and limiting shrinkage. They deal well with extreme outliers and with variations of density along the manifold. We apply them as preprocessing for dimensionality reduction and for nearest-neighbor classification of MNIST digits, with consistent improvements of up to 36% over the original data.
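
To make the update concrete, here is a minimal sketch of one way such an iteration could look. It is not the reference implementation: the Gaussian kernel, the k-nearest-neighbor neighborhoods, the use of local PCA to estimate the tangent space, and the names `mbms_step`, `mbms`, `k`, `sigma` and `L` (assumed intrinsic dimension) are all assumptions made for illustration. Each point takes a mean-shift step towards its neighbors, and the part of that step lying in the estimated tangent space is then removed, so that only motion orthogonal to the manifold remains.

```python
import numpy as np

def mbms_step(X, k=20, sigma=0.2, L=1):
    """One synchronous update of all points (illustrative sketch, not the authors' code)."""
    N, D = X.shape
    # Pairwise squared distances; fine for small N, use a spatial index for large data.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    nbrs = np.argsort(sq, axis=1)[:, :k]  # each point counts as its own nearest neighbor
    X_new = np.empty_like(X)
    for n in range(N):
        P = X[nbrs[n]]  # (k, D) neighborhood of point n
        # Mean-shift step: Gaussian-weighted average of the neighbors minus the point.
        w = np.exp(-0.5 * sq[n, nbrs[n]] / sigma ** 2)
        shift = w @ P / w.sum() - X[n]
        # Local PCA: the top-L right singular vectors span the estimated tangent space.
        _, _, Vt = np.linalg.svd(P - P.mean(axis=0), full_matrices=False)
        U = Vt[:L].T  # (D, L) orthonormal basis
        # Remove the tangential component so the motion is orthogonal to the manifold.
        X_new[n] = X[n] + shift - U @ (U.T @ shift)
    return X_new

def mbms(X, n_iter=2, **kwargs):
    """Apply a small number of update steps; few iterations limit shrinkage."""
    X = np.asarray(X, dtype=float)
    for _ in range(n_iter):
        X = mbms_step(X, **kwargs)
    return X

def radial_error(X):
    """Mean absolute distance of 2-D points from the unit circle."""
    return np.mean(np.abs(np.linalg.norm(X, axis=1) - 1.0))

# Toy data (assumed for the demo): a unit circle corrupted by isotropic Gaussian noise.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 300, endpoint=False)
clean = np.column_stack([np.cos(theta), np.sin(theta)])
noisy = clean + 0.03 * rng.normal(size=clean.shape)
denoised = mbms(noisy, n_iter=2, k=20, sigma=0.2, L=1)

print("radial error before:", radial_error(noisy))
print("radial error after: ", radial_error(denoised))
```

In this sketch, setting `L=0` leaves the step unprojected, which corresponds to an ordinary blurring mean-shift step restricted to k neighbors; larger `L` removes more of the motion along the estimated manifold directions. The values of `k`, `sigma` and `n_iter` are untuned choices for the toy example.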
