A Generative Process for sampling Contractive Auto-Encoders

The contractive auto-encoder learns a representation of the input data that captures the local manifold structure around each data point, through the leading singular vectors of the Jacobian of the transformation from input to representation. The corresponding singular values specify how much local variation is plausible in directions associated with the corresponding singular vectors, while remaining in a high-density region of the input space. This paper proposes a procedure for generating samples that are consistent with the local structure captured by a contractive auto-encoder. The associated stochastic process defines a distribution from which one can sample, and which experimentally appears to converge quickly and mix well between modes, compared to Restricted Boltzmann Machines and Deep Belief Networks. The intuitions behind this procedure can also be used to train the second layer of contraction that pools lower-level features and learns to be invariant to the local directions of variation discovered in the first layer. We show that this can help learn and represent invariances present in the data and improve classification error.

[1]  Bernhard Schölkopf,et al.  Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.

[2]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[3]  Quoc V. Le,et al.  Measuring Invariances in Deep Networks , 2009, NIPS.

[4]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[5]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[6]  Pascal Vincent,et al.  Non-Local Manifold Parzen Windows , 2005, NIPS.

[7]  Yoshua Bengio,et al.  Nonlocal Estimation of Manifold Structure , 2006, Neural Computation.

[8]  Yoshua Bengio,et al.  Non-Local Manifold Tangent Learning , 2004, NIPS.

[9]  Pascal Vincent,et al.  Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.

[10]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[11]  Andrew Y. Ng,et al.  Selecting Receptive Fields in Deep Networks , 2011, NIPS.

[12]  Christopher M. Bishop,et al.  Mixtures of Probabilistic Principal Component Analyzers , 1999, Neural Computation.

[13]  Pascal Vincent,et al.  Quickly Generating Representative Samples from an RBM-Derived Process , 2011, Neural Computation.

[14]  Yann LeCun,et al.  Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network , 1991, NIPS.

[15]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[16]  Hariharan Narayanan,et al.  Sample Complexity of Testing the Manifold Hypothesis , 2010, NIPS.

[17]  Geoffrey E. Hinton,et al.  On deep generative models with applications to recognition , 2011, CVPR 2011.

[18]  Pascal Vincent,et al.  The Manifold Tangent Classifier , 2011, NIPS.