NICE: Non-linear Independent Components Estimation

We propose a deep learning framework for modeling complex high-dimensional densities called Non-linear Independent Component Estimation (NICE). It is based on the idea that a good representation is one in which the data has a distribution that is easy to model. For this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transformed data conform to a factorized distribution, i.e., resulting in independent latent variables. We parametrize this transformation so that computing the Jacobian determinant and inverse transform is trivial, yet we maintain the ability to learn complex non-linear transformations, via a composition of simple building blocks, each based on a deep neural network. The training criterion is simply the exact log-likelihood, which is tractable. Unbiased ancestral sampling is also easy. We show that this approach yields good generative models on four image datasets and can be used for inpainting.

[1]  Yoshua Bengio,et al.  Artificial neural networks and their application to sequence recognition , 1991 .

[2]  Geoffrey E. Hinton,et al.  The Helmholtz Machine , 1995, Neural Computation.

[3]  S. Mallat A wavelet tour of signal processing , 1998 .

[4]  Aapo Hyvärinen,et al.  Nonlinear independent component analysis: Existence and uniqueness results , 1999, Neural Networks.

[5]  Samy Bengio,et al.  Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks , 1999, NIPS.

[6]  Erkki Oja,et al.  Independent component analysis: algorithms and applications , 2000, Neural Networks.

[7]  Juha Karhunen,et al.  Nonlinear Independent Component Analysis Using Ensemble Learning: Experiments and Discussion , 2000 .

[8]  Richard M. Everson,et al.  Independent Component Analysis: Principles and Practice , 2001 .

[9]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[10]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[11]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[12]  Ruslan Salakhutdinov,et al.  On the quantitative analysis of deep belief networks , 2008, ICML '08.

[13]  Aapo Hyvärinen,et al.  Natural Image Statistics - A Probabilistic Approach to Early Computational Vision , 2009, Computational Imaging and Vision.

[14]  Geoffrey E. Hinton,et al.  Deep Boltzmann Machines , 2009, AISTATS.

[15]  A. Krizhevsky Convolutional Deep Belief Networks on CIFAR-10 , 2010 .

[16]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[17]  Hugo Larochelle,et al.  The Neural Autoregressive Distribution Estimator , 2011, AISTATS.

[18]  Razvan Pascanu,et al.  Theano: Deep Learning on GPUs with Python , 2012 .

[19]  Razvan Pascanu,et al.  Theano: new features and speed improvements , 2012, ArXiv.

[20]  Geoffrey E. Hinton,et al.  Deep Mixtures of Factor Analysers , 2012, ICML.

[21]  Hugo Larochelle,et al.  RNADE: The real-valued neural autoregressive density-estimator , 2013, NIPS.

[22]  Ryan P. Adams,et al.  High-Dimensional Probability Estimation with Deep Density Models , 2013, ArXiv.

[23]  Ian J. Goodfellow,et al.  Pylearn2: a machine learning research library , 2013, ArXiv.

[24]  Yoshua Bengio,et al.  Better Mixing via Deep Representations , 2012, ICML.

[25]  Ruslan Salakhutdinov,et al.  Annealing between distributions by averaging moments , 2013, NIPS.

[26]  Karol Gregor,et al.  Neural Variational Inference and Learning in Belief Networks , 2014, ICML.

[27]  Max Welling,et al.  Learning the Irreducible Representations of Commutative Lie Groups , 2014, ICML.

[28]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[29]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[30]  Daan Wierstra,et al.  Deep AutoRegressive Networks , 2013, ICML.

[31]  Yoshua Bengio,et al.  How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation , 2014, ArXiv.

[32]  Yoshua Bengio,et al.  Deep Directed Generative Autoencoders , 2014, ArXiv.

[33]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.