Paper Information - Manifold Denoising as Preprocessing for Finding Natural Representations of Data

A natural representation of data is given by the parameters which generated the data. If the space of parameters is continuous, then we can regard it as a manifold. In practice, we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. However, the data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We will demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we will show that, when the denoising method is used as a preprocessing step, it can significantly improve the results of a semi-supervised learning algorithm.
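The idea of denoising by a diffusion process on a data-generated graph can be illustrated with a minimal sketch. It is not taken from the paper: the symmetric k-nearest-neighbour graph, the Gaussian weights with a single global bandwidth, the random-walk graph Laplacian, the implicit Euler step, and the names `knn_gaussian_weights`, `denoise`, `k`, `dt` and `n_iter` are all assumptions made here for illustration, and the published algorithm may differ in these details.

```python
import numpy as np


def knn_gaussian_weights(X, k):
    """Symmetric k-NN graph with Gaussian weights (an assumed construction)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbours
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    # Assumed bandwidth: mean squared distance to the k-th neighbour.
    sigma2 = np.mean(d2[np.arange(n), idx[:, -1]])
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)  # keep an edge if either endpoint selected it


def denoise(X, k=10, dt=0.5, n_iter=5):
    """Run a few implicit diffusion steps on a graph rebuilt from the data."""
    X = np.asarray(X, dtype=float).copy()
    n = X.shape[0]
    for _ in range(n_iter):
        W = knn_gaussian_weights(X, k)
        P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
        L = np.eye(n) - P  # random-walk graph Laplacian
        # Implicit Euler step of dX/dt = -L X: solve (I + dt * L) X_new = X.
        X = np.linalg.solve(np.eye(n) + dt * L, X)
    return X
```

For example, calling `denoise` on points sampled from a unit circle with small Gaussian noise should pull the samples toward the circle, so the spread of their radii shrinks. The dense distance matrix keeps the sketch short, but it limits it to small data sets.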

Matthias Hein | Markus Maier

[1] Bernhard Schölkopf, et al. Learning with Local and Global Consistency, 2003, NIPS.

[2] Mikhail Belkin, et al. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, 2003, Neural Computation.

[3] P. Grassberger, et al. Measuring the Strangeness of Strange Attractors, 1983.

[4] Christopher M. Bishop, et al. GTM: The Generative Topographic Mapping, 1998, Neural Computation.

[5] J. Tenenbaum, et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.

[6] Matthias Hein, et al. Manifold Denoising, 2006, NIPS.

[7] Gabriel Taubin, et al. A signal processing approach to fair surface design, 1995, SIGGRAPH.

[8] Ulrike von Luxburg, et al. From Graphs to Manifolds - Weak and Strong Pointwise Consistency of Graph Laplacians, 2005, COLT.

[9] Hermann Ney, et al. Adaptation in statistical pattern recognition using tangent vectors, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] T. Hastie, et al. Principal Curves, 2007.