Experimenting with Deep Convolutional Visual Feature Learning Using Compositional Subspace Representation and Fashion-MNIST

This paper introduces a formal framework, called compositional subspace representation, to model convolutional visual feature learning in a convolutional neural network. The objective is to explain the convolutional feature-learning computation in a rigorous, structured way. The theoretical basis of the proposed framework is that a complex learning function is best represented as a composition of simple two-dimensional piecewise-linear functions, which together form a multilayer cascade of successive projections. Under the same hypothesis, the framework also explains hierarchical feature learning, the well-acknowledged advantage of convolutional neural networks in visual computing. The framework is evaluated on image classification using the Fashion-MNIST dataset. Experimental assessments using learning-curve analysis, a confusion matrix, and visual inspection are presented and discussed. The experimental results were consistent with the theoretical expectation.
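The core idea of composing simple piecewise-linear functions into a multilayer cascade of projections can be sketched numerically. The following is a minimal illustration, not the paper's implementation: the layer sizes, weight scaling, and the use of ReLU as the piecewise-linear building block are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # A simple piecewise-linear function: identity on positives, zero elsewhere.
    return np.maximum(0.0, x)

def cascaded_projection(x, weights):
    # Successively project the input through linear maps, each followed by
    # the piecewise-linear nonlinearity, so the overall mapping is a
    # composition of simple piecewise-linear pieces.
    for W in weights:
        x = relu(W @ x)
    return x

# Hypothetical layer sizes: a flattened 28x28 Fashion-MNIST image (784-dim)
# projected through two successive subspaces of decreasing dimension.
sizes = [(128, 784), (64, 128)]
weights = [rng.standard_normal(s) * 0.05 for s in sizes]

x = rng.standard_normal(784)        # stand-in for one flattened image
z = cascaded_projection(x, weights) # final 64-dim representation
print(z.shape)
```

Each stage is linear on every region where the ReLU pattern is fixed, so the composed map remains piecewise linear while gaining representational depth with each cascade level.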
