Manifold Learning in Data Mining Tasks

Many Data Mining tasks deal with data presented in high-dimensional spaces, and the 'curse of dimensionality' is often an obstacle to applying many methods to these tasks. To mitigate this phenomenon, various Representation Learning algorithms are used as a key first step: they transform the original high-dimensional data into lower-dimensional representations that preserve as much of the information required for the Data Mining task at hand as possible. These Representation Learning problems are formulated as different Dimensionality Reduction problems (Sample Embedding, Data Manifold Embedding, Manifold Learning, and the newly proposed Tangent Bundle Manifold Learning), each motivated by particular Data Mining tasks. A new geometrically motivated algorithm is presented that solves the Tangent Bundle Manifold Learning problem and yields new solutions to all the considered Dimensionality Reduction problems.

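To illustrate the core geometric step behind such tangent-bundle approaches, the sketch below estimates the tangent space at each sample point by local PCA over k-nearest neighbourhoods. This is a simplified, assumed stand-in and not the Grassmann&Stiefel Eigenmaps algorithm itself; the function estimate_tangent_spaces, its parameters (intrinsic_dim, n_neighbors), and the use of scikit-learn and NumPy are illustrative choices.

    # Minimal sketch: estimating local tangent spaces of a sampled manifold
    # via local PCA over k-nearest neighbourhoods. Illustrative only; this is
    # NOT the Grassmann&Stiefel Eigenmaps procedure referred to in the text.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def estimate_tangent_spaces(X, intrinsic_dim, n_neighbors=10):
        """For each row X[i] of the n x p sample matrix, return an
        orthonormal p x intrinsic_dim basis of the estimated tangent space."""
        nn = NearestNeighbors(n_neighbors=n_neighbors).fit(X)
        _, idx = nn.kneighbors(X)
        bases = np.empty((X.shape[0], X.shape[1], intrinsic_dim))
        for i, neighbours in enumerate(idx):
            local = X[neighbours] - X[neighbours].mean(axis=0)  # centre the patch
            # The leading right singular vectors span the local principal
            # subspace, which approximates the tangent space at X[i].
            _, _, vt = np.linalg.svd(local, full_matrices=False)
            bases[i] = vt[:intrinsic_dim].T
        return bases

    if __name__ == "__main__":
        # Toy example: a noisy 2-D manifold (swiss roll) embedded in R^3.
        from sklearn.datasets import make_swiss_roll
        X, _ = make_swiss_roll(n_samples=500, noise=0.05, random_state=0)
        tangents = estimate_tangent_spaces(X, intrinsic_dim=2, n_neighbors=12)
        print(tangents.shape)  # (500, 3, 2): one orthonormal 3x2 frame per point

Running the example on a sampled swiss roll prints the shape of the returned bases, one orthonormal 3x2 frame per point; such local frames are the raw material from which tangent-bundle methods construct a consistent global low-dimensional embedding.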