Validation of Nonlinear PCA

Linear principal component analysis (PCA) can be extended to nonlinear PCA by using artificial neural networks. The benefit of curved components, however, depends on careful control of model complexity. Moreover, standard techniques for model selection, including cross-validation and, more generally, the use of an independent test set, fail when applied to nonlinear PCA because it is inherently unsupervised. This paper presents a new approach for validating the complexity of nonlinear PCA models that uses the error in missing data estimation as the criterion for model selection. It is motivated by the idea that only the model of optimal complexity can predict missing values with the highest accuracy. Whereas standard test set validation usually favours over-fitted nonlinear PCA models, the proposed validation approach correctly selects the optimal model complexity.
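The difference between the two validation criteria can be illustrated with a minimal sketch. It is not the paper's implementation: a polynomial curve of adjustable degree stands in for the neural network, the component value is found by a simple grid search, and all names (`fit_curve`, `project`, `missing_data_error`, `full_reconstruction_error`, the toy data) are assumptions made for illustration. For each test sample, one coordinate is hidden, the component value is estimated from the remaining coordinates only, and the hidden coordinate is then predicted from the curve.

```python
import numpy as np

GRID = np.linspace(-1.5, 1.5, 301)  # candidate component values z (assumed range)

def design(z, degree):
    # Polynomial basis 1, z, z^2, ..., z^degree for each component value.
    return np.vander(z, degree + 1, increasing=True)

def project(X, mask, coeffs, degree):
    # For each sample, pick the grid value z whose curve point is closest,
    # measuring distance over the observed (mask == True) coordinates only.
    curve = design(GRID, degree) @ coeffs              # shape (G, d)
    sq = (X[:, None, :] - curve[None, :, :]) ** 2      # shape (n, G, d)
    dist = (sq * mask[:, None, :]).sum(axis=2)         # shape (n, G)
    return GRID[dist.argmin(axis=1)]

def fit_curve(X, degree, n_iter=20):
    # Alternate between fitting the curve and re-estimating z,
    # starting from the first linear principal component.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    z = Xc @ Vt[0]
    z = z / np.abs(z).max()
    full = np.ones(X.shape, dtype=bool)
    for _ in range(n_iter):
        coeffs = np.linalg.lstsq(design(z, degree), X, rcond=None)[0]
        z = project(X, full, coeffs, degree)
    return coeffs

def full_reconstruction_error(X, coeffs, degree):
    # Standard test set criterion: all coordinates are observed.
    z = project(X, np.ones(X.shape, dtype=bool), coeffs, degree)
    return float(np.mean((design(z, degree) @ coeffs - X) ** 2))

def missing_data_error(X, coeffs, degree):
    # Missing data criterion: hide one coordinate at a time, estimate z
    # from the others, and score the prediction of the hidden coordinate.
    errs = []
    for i in range(X.shape[1]):
        mask = np.ones(X.shape, dtype=bool)
        mask[:, i] = False
        z = project(X, mask, coeffs, degree)
        pred = design(z, degree) @ coeffs
        errs.append(np.mean((pred[:, i] - X[:, i]) ** 2))
    return float(np.mean(errs))

# Toy data (hypothetical): a noisy one-dimensional curve in three dimensions.
rng = np.random.default_rng(0)

def sample(n):
    t = rng.uniform(-1, 1, n)
    clean = np.column_stack([t, t ** 2, 0.5 * t ** 3])
    return clean + 0.1 * rng.normal(size=clean.shape)

X_train, X_test = sample(60), sample(200)
results = {}
for degree in (1, 2, 3, 6, 9):
    coeffs = fit_curve(X_train, degree)
    results[degree] = (
        full_reconstruction_error(X_test, coeffs, degree),
        missing_data_error(X_test, coeffs, degree),
    )
    print(degree, results[degree])
```

Under the missing data criterion, the degree with the lowest second value in `results` would be selected. The sketch makes no claim about which degree that is for this toy data; it only shows how the two criteria are computed.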
