Convolutional Visual Feature Learning: A Compositional Subspace Representation Perspective

The main contribution of this paper is to provide a new perspective on understanding end-to-end convolutional visual feature learning in a convolutional neural network (ConvNet) through empirical feature map analysis. The analysis is performed with a novel method, the compositional subspace model, applied to a minimal ConvNet. This method allows us to better understand how a ConvNet learns visual features in a hierarchical manner. Handwritten digit recognition on the MNIST dataset is used to carry out the empirical feature map analysis. The experimental results support our proposal to use the compositional subspace model to visually understand convolutional visual feature learning in a ConvNet.
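To make the feature-map analysis concrete, the following is a minimal sketch of how a single convolutional layer maps an input image to a feature map. The function names, the toy image, and the hand-set vertical-edge filter are illustrative assumptions for exposition only; they are not the paper's compositional subspace model or its learned filters.

```python
# Hedged sketch: one valid-mode 2D convolution followed by ReLU,
# illustrating how one layer of a minimal ConvNet produces a feature map.
# The filter here is hand-set (not learned) purely for illustration.

def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation of a 2-D list `image` with `kernel`."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = 0.0
            for u in range(kh):
                for v in range(kw):
                    s += image[i + u][j + v] * kernel[u][v]
            row.append(s)
        out.append(row)
    return out

def relu(fmap):
    """Element-wise ReLU nonlinearity applied to a feature map."""
    return [[max(0.0, x) for x in row] for row in fmap]

if __name__ == "__main__":
    # A toy 6x6 "image" with a vertical edge down the middle.
    image = [[1.0] * 3 + [0.0] * 3 for _ in range(6)]
    # A simple vertical-edge filter (illustrative assumption, not learned).
    kernel = [[1.0, 0.0, -1.0]] * 3
    fmap = relu(conv2d_valid(image, kernel))
    print(len(fmap), len(fmap[0]))  # 4 4 (valid convolution shrinks 6x6 to 4x4)
```

The feature map responds strongly only where the filter's pattern (here, a vertical edge) appears in the input; stacking such layers is what lets a ConvNet compose simple patterns into hierarchical visual features.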
