Shallow and deep learning for image classification

The paper demonstrates the advantages of deep learning approaches over ordinary shallow neural networks by comparing their performance on image classification tasks drawn from the popular benchmark databases FERET and MNIST. An autoassociative neural network is used as a standalone program that performs nonlinear principal component analysis, extracting the most informative features of the input data before the networks under comparison are applied as classifiers. A dedicated study of the optimal choice of activation function and of the normalization transformation applied to the input data improves the efficiency of this autoassociative program. A further study of its denoising properties demonstrates its high efficiency even on noisy data. Three types of neural networks are compared: a feed-forward network with one hidden layer, a deep network with several hidden layers, and a deep belief network whose pretraining layers are realized as restricted Boltzmann machines. The number of hidden layers and the number of neurons in each were chosen by a cross-validation procedure, balancing network size against classification accuracy. The results of our comparative study demonstrate the clear advantage of deep networks, as well as the denoising power of autoencoders. We use both multiprocessor graphics cards and cloud services to speed up the calculations. The paper is addressed to specialists in applied scientific and experimental fields who already have some knowledge of artificial neural networks, probability theory, and numerical methods.
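
To make the pipeline concrete, the sketch below builds a bottleneck autoencoder of the kind described above (an autoassociative network performing nonlinear principal component analysis) and reuses its encoder as a standalone feature extractor for downstream classifiers. This is a minimal illustrative sketch in Keras, not the authors' implementation: the layer sizes, the ReLU/sigmoid activations, the Adam optimizer, and the 784-dimensional input (a flattened 28x28 MNIST image) are all assumptions made here for illustration.

    # Minimal sketch: bottleneck autoencoder as a nonlinear-PCA feature
    # extractor. All architectural choices below are illustrative
    # assumptions, not taken from the paper.
    from tensorflow import keras
    from tensorflow.keras import layers

    input_dim = 784   # e.g. a flattened 28x28 MNIST image (assumed)
    code_dim = 32     # bottleneck width (assumed)

    # Autoassociative network: trained to reproduce its own input, so
    # the narrow bottleneck layer is forced to learn a compressed,
    # nonlinear-PCA-like representation of the data.
    inputs = keras.Input(shape=(input_dim,))
    encoded = layers.Dense(128, activation="relu")(inputs)
    code = layers.Dense(code_dim, activation="relu")(encoded)
    decoded = layers.Dense(128, activation="relu")(code)
    outputs = layers.Dense(input_dim, activation="sigmoid")(decoded)

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")
    # autoencoder.fit(x_train, x_train, epochs=20, batch_size=128)
    # For the denoising study, one would instead fit on corrupted
    # inputs with clean targets:
    # autoencoder.fit(x_train_noisy, x_train, epochs=20, batch_size=128)

    # The trained encoder is then reused on its own, supplying the
    # extracted features to the classifiers being compared.
    encoder = keras.Model(inputs, code)
    # features = encoder.predict(x_train)
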
