Competitive Learning Enriches Learning Representation and Accelerates the Fine-tuning of CNNs

In this study, we propose integrating competitive learning into convolutional neural networks (CNNs) to improve representation learning and the efficiency of fine-tuning. Conventional CNNs are trained with backpropagation, which enables powerful representation learning through a discrimination task. However, backpropagation requires a huge amount of labeled data, and labeled data are far harder to acquire than unlabeled data, so efficient use of unlabeled data is becoming crucial for deep neural networks (DNNs). To address this problem, we introduce unsupervised competitive learning into the convolutional layer and exploit unlabeled data for effective representation learning. Validation experiments with a toy model demonstrated that competitive learning effectively extracted image bases into the convolutional filters from unlabeled data and accelerated the subsequent supervised fine-tuning by backpropagation. The benefit was more apparent when the number of filters was sufficiently large; in that case, the error rate dropped steeply in the initial phase of fine-tuning. The proposed method therefore allows the number of filters in a CNN to be enlarged, enabling a more detailed and generalized representation, and it suggests the possibility of neural networks that are not only deep but also broad.
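To make the pre-training step concrete, the sketch below shows one plausible way to learn convolutional filter bases from unlabeled image patches with a standard winner-take-all competitive learning update before copying them into a CNN for supervised fine-tuning. The abstract does not specify the exact update rule, patch size, or number of filters; the function name, parameters, and the Rumelhart–Zipser-style update used here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def competitive_learning_filters(patches, n_filters=32, lr=0.01, epochs=5, seed=0):
    """Learn filter bases from unlabeled, flattened image patches using a
    winner-take-all competitive learning update (an assumed rule, for illustration).

    patches : array of shape (n_patches, patch_dim)
    returns : array of shape (n_filters, patch_dim), unit-norm filter weights
    """
    rng = np.random.default_rng(seed)
    # Initialize filters randomly and normalize each to unit length.
    w = rng.normal(size=(n_filters, patches.shape[1]))
    w /= np.linalg.norm(w, axis=1, keepdims=True)

    for _ in range(epochs):
        for x in rng.permutation(patches):
            # The filter with the largest response to this patch wins.
            winner = np.argmax(w @ x)
            # Move only the winning filter toward the input patch.
            w[winner] += lr * (x - w[winner])
            # Re-normalize so no single filter dominates the competition.
            w[winner] /= np.linalg.norm(w[winner])
    return w

if __name__ == "__main__":
    # Stand-in for real 5x5 patches sampled from unlabeled images.
    dummy_patches = np.random.rand(10000, 25)
    filters = competitive_learning_filters(dummy_patches, n_filters=32)
    # Reshape into (out_channels, in_channels, kH, kW) to initialize the first
    # convolutional layer before supervised fine-tuning with backpropagation.
    conv_weights = filters.reshape(32, 1, 5, 5)
```

In this sketch, the learned bases serve only as an initialization: fine-tuning then proceeds with ordinary labeled-data backpropagation on the full network.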
