Gene expression Non-linear PCA: a missing data approach

Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values. Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured
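
As an illustration of the statement that missing values are ignored during optimization and estimated afterwards, the following is a minimal sketch, not the authors' implementation. It assumes missing entries are encoded as NaN and that some model (not shown here) has already produced a reconstruction `X_hat`; the function names `masked_sse` and `fill_missing` are hypothetical.

```python
import numpy as np

def masked_sse(X, X_hat):
    """Sum of squared errors over observed entries only.

    Assumption: missing values in X are encoded as NaN. They
    contribute zero error, so they do not influence optimization.
    """
    observed = ~np.isnan(X)
    diff = np.where(observed, X - X_hat, 0.0)
    return float(np.sum(diff ** 2))

def fill_missing(X, X_hat):
    """Estimate missing entries of X from the reconstruction X_hat,
    leaving observed entries unchanged."""
    return np.where(np.isnan(X), X_hat, X)

# Toy data: one missing entry in the first row.
X = np.array([[1.0, np.nan],
              [3.0, 4.0]])
# Hypothetical reconstruction from a fitted model.
X_hat = np.array([[1.5, 2.0],
                  [3.0, 3.0]])

error = masked_sse(X, X_hat)        # only the three observed entries count
X_filled = fill_missing(X, X_hat)   # NaN replaced by the reconstructed value
```

In a setting like the one the abstract describes, an error of this masked form would be minimized with respect to the model parameters, and the reconstruction at the missing positions would then serve as the estimate.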
