Paper information - Gated Autoencoders with Tied Input Weights

Gated Autoencoders with Tied Input Weights

The semantic interpretation of images is one of the core applications of deep learning. Several techniques have recently been proposed to model the relation between two images, with applications to pose estimation, action recognition, and invariant object recognition. Among these techniques, higher-order Boltzmann machines and relational autoencoders project the two images onto different subspaces, and their intermediate layers act as transformation-specific detectors. In this work, we extend the mathematical study of (Memisevic, 2012b) to show that a single projection can be used for both images, in a way that turns the intermediate layers into spectrum encoders of transformations. We show that this results in networks that are easier to tune and have greater generalization capabilities.

Olivier Sigaud | Alain Droniou
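
As a rough illustration of the idea summarized in the abstract, the following minimal sketch shows a factored gated autoencoder in which both images pass through the same projection matrix. It is not the authors' implementation: the names `U`, `W`, `mapping_units`, and `reconstruct_y`, the sigmoid nonlinearity, the real-valued weights, and all layer sizes are assumptions chosen only for illustration, and no training step is shown.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def mapping_units(x, y, U, W):
    """Encode the relation between images x and y.

    Assumption: a single filter bank U is applied to both images
    (tied input weights), instead of one filter bank per image.
    """
    fx = U @ x  # factor responses for the first image
    fy = U @ y  # same projection applied to the second image
    return sigmoid(W @ (fx * fy))  # multiplicative (gating) interaction


def reconstruct_y(x, m, U, W):
    """Decode an estimate of y from x and the mapping units m.

    Assumption: the decoder reuses the transposes of U and W.
    """
    fx = U @ x
    fm = W.T @ m
    return U.T @ (fx * fm)


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
n_pixels, n_factors, n_mappings = 16, 8, 4
U = rng.normal(scale=0.1, size=(n_factors, n_pixels))
W = rng.normal(scale=0.1, size=(n_mappings, n_factors))

x = rng.normal(size=n_pixels)
y = rng.normal(size=n_pixels)

m = mapping_units(x, y, U, W)
y_hat = reconstruct_y(x, m, U, W)
```

In a conventional relational autoencoder, `fx` and `fy` would each come from their own weight matrix; here the only change is that `U` is shared between the two inputs.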
