Paper information - SpinalNet: Deep Neural Network with Gradual Input

SpinalNet: Deep Neural Network with Gradual Input

Over the past few years, deep neural networks (DNNs) have achieved remarkable success in a diverse range of real-world applications. However, DNNs take a large number of inputs and contain a large number of parameters, resulting in high computational demand. We study the human somatosensory system and propose SpinalNet to achieve higher accuracy with fewer computational resources. In a typical neural network (NN) architecture, the hidden layers receive inputs in the first layer and then transfer the intermediate outcomes to the next layer. In the proposed SpinalNet, the hidden layers are organized into three sectors: 1) input row, 2) intermediate row, and 3) output row. The intermediate row of the SpinalNet contains only a few neurons. Input segmentation enables each hidden layer to receive a part of the inputs together with the outputs of the previous layer. Therefore, the number of incoming weights in a hidden layer is significantly lower than in traditional DNNs. As all layers of the SpinalNet contribute directly to the output row, the vanishing gradient problem does not exist. We also investigate the SpinalNet as a fully-connected layer in several well-known DNN models and perform traditional learning and transfer learning. We observe significant error reductions with lower computational costs in most of the DNNs. We have also obtained state-of-the-art (SOTA) performance on the QMNIST, Kuzushiji-MNIST, EMNIST (Letters, Digits, and Balanced), STL-10, Bird225, Fruits 360, and Caltech-101 datasets. The scripts of the proposed SpinalNet are available at the following link: this https URL
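The gradual-input idea described in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' released script. The class name `SpinalSketch`, the helper functions, the layer sizes, the ReLU activation, and the rule of cycling through equal-sized input segments are all assumptions made for illustration. The sketch only shows the two structural properties the abstract states: each hidden layer sees one part of the input plus the previous layer's outputs, and every hidden layer feeds the output row directly.

```python
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, activate=True):
    """Apply a dense layer to the list x, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if activate else out


class SpinalSketch:
    """Hypothetical gradual-input network; n_inputs must be divisible by n_segments."""

    def __init__(self, n_inputs, n_segments, width, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.seg_len = n_inputs // n_segments
        self.n_segments = n_segments
        self.hidden = []
        for i in range(n_hidden):
            # The first hidden layer sees only an input segment; later layers
            # also see the outputs of the previous hidden layer.
            n_in = self.seg_len + (width if i > 0 else 0)
            self.hidden.append(make_layer(n_in, width, rng))
        # The output row is connected to every hidden layer.
        self.output = make_layer(width * n_hidden, n_classes, rng)

    def forward(self, x):
        outputs, prev = [], []
        for i, layer in enumerate(self.hidden):
            # Assumption: cycle through the input segments layer by layer.
            start = (i % self.n_segments) * self.seg_len
            segment = x[start:start + self.seg_len]
            prev = dense(layer, segment + prev)
            outputs.append(prev)
        merged = [v for out in outputs for v in out]
        return dense(self.output, merged, activate=False)

    def incoming_hidden_weights(self):
        """Count the weights entering all hidden layers (biases excluded)."""
        return sum(len(w) * len(w[0]) for w, _ in self.hidden)


net = SpinalSketch(n_inputs=784, n_segments=2, width=8, n_hidden=4, n_classes=10)
scores = net.forward([0.5] * 784)
print(len(scores))                    # 10 class scores
print(net.incoming_hidden_weights())  # 12736
print(784 * 32)                       # 25088 for one dense layer with 32 neurons
```

With these assumed sizes, the four hidden layers together hold 32 neurons and 12,736 incoming weights, while a single conventional dense layer of 32 neurons over all 784 inputs would need 25,088. The numbers only illustrate the reduction in incoming weights that the abstract refers to; they say nothing about accuracy.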
