Transforming Autoencoders

One way to design an object recognition system is to define objects recursively in terms of their parts and the required spatial relationships between the parts and the whole. A natural way for a neural network to implement this knowledge is by using a matrix of weights to represent each part-whole relationship and a vector of neural activities to represent the pose of each part or whole relative to the viewer [10]. This leads to neural networks that can recognize objects over a wide range of viewpoints using neural activities that are “equivariant” rather than invariant: as the viewpoint varies the neural activities all vary even though the knowledge in the weights is viewpoint-invariant. The “capsules” that implement the lowest-level parts in the shape hierarchy need to extract explicit pose parameters from pixel intensities. This paper shows that these capsules are quite easy to learn from pairs of transformed images if the neural net has direct, non-visual access to the transformations.
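The architecture implied by the last two sentences can be sketched as follows. This is a minimal, untrained forward pass only, with many illustrative assumptions: invented layer sizes, a pose restricted to a 2-D translation (x, y), single-layer recognition and generation nets, and random weights. The capsule infers a pose and a presence probability from the pixels, the externally supplied shift (dx, dy) is added to the pose, and the capsule's gated pixel contribution is regenerated from the transformed pose; training (not shown) would minimize the error between the summed output and the actual shifted image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes -- not taken from the paper's experiments.
N_PIX = 100   # flattened 10x10 input image
N_REC = 20    # recognition units per capsule
N_GEN = 20    # generation units per capsule
N_CAPS = 3    # number of capsules

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Capsule:
    """One capsule of a transforming autoencoder (forward pass only)."""
    def __init__(self):
        # Recognition weights: pixels -> hidden recognition units.
        self.W_rec = rng.normal(0, 0.1, (N_REC, N_PIX))
        # Readout heads for the pose (x, y) and the presence probability p.
        self.w_x = rng.normal(0, 0.1, N_REC)
        self.w_y = rng.normal(0, 0.1, N_REC)
        self.w_p = rng.normal(0, 0.1, N_REC)
        # Generation weights: transformed pose -> hidden -> pixels.
        self.W_gen_in = rng.normal(0, 0.1, (N_GEN, 2))
        self.W_gen_out = rng.normal(0, 0.1, (N_PIX, N_GEN))

    def forward(self, image, dx, dy):
        h = sigmoid(self.W_rec @ image)      # recognition units
        x = self.w_x @ h                     # inferred pose: x coordinate
        y = self.w_y @ h                     # inferred pose: y coordinate
        p = sigmoid(self.w_p @ h)            # probability the part is present
        # Apply the externally supplied (non-visual) transformation to the
        # pose, then regenerate this capsule's pixel contribution, gated by p.
        g = sigmoid(self.W_gen_in @ np.array([x + dx, y + dy]))
        return p * (self.W_gen_out @ g)

def transforming_autoencoder(image, dx, dy, capsules):
    # The predicted transformed image is the sum of all capsule contributions.
    return sum(c.forward(image, dx, dy) for c in capsules)

capsules = [Capsule() for _ in range(N_CAPS)]
image = rng.random(N_PIX)
output = transforming_autoencoder(image, 1.0, -2.0, capsules)
```

A training loop would present pairs (image, shifted image) together with the shift (dx, dy) and backpropagate the reconstruction error, so each capsule is forced to make its x and y outputs behave like genuine coordinates: the only way to predict the shifted image from pose plus shift is for the pose to be equivariant with the transformation.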