Assessing Intelligence in Artificial Neural Networks

The purpose of this work was to develop metrics for assessing network architectures that balance neural network size and task performance. To this end, the concept of neural efficiency is introduced to measure neural layer utilization, and a second metric, the artificial intelligence quotient (aIQ), is introduced to balance neural network performance against neural network efficiency. To study aIQ and neural efficiency, two simple neural networks were trained on MNIST: a fully connected network (LeNet-300-100) and a convolutional neural network (LeNet-5). The LeNet-5 network with the highest aIQ was 2.32% less accurate but contained 30,912 times fewer parameters than the highest-accuracy network. Both batch normalization and dropout layers were found to increase neural efficiency. Finally, high-aIQ networks are shown to resist memorization and overtraining, learning correct digit classification with 92.51% accuracy even when 75% of the class labels are randomized. These results demonstrate the utility of aIQ and neural efficiency as metrics for balancing network performance and size.
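
The abstract does not reproduce the paper's exact formulas, so the sketch below is only illustrative: it assumes per-layer neural efficiency is the entropy of binarized activation states normalized by the layer's maximum possible entropy, aggregates layers with a geometric mean, and folds accuracy and efficiency into an aIQ-style score via a weighted geometric mean with an assumed exponent `beta`. The firing threshold, the aggregation choices, and all function names are assumptions made for illustration, not the authors' definitions.

```python
# Illustrative sketch only (assumed formulas, not the paper's exact method):
# neural efficiency as normalized entropy of binarized layer activations,
# combined with task accuracy into an aIQ-style score.
import numpy as np


def layer_efficiency(activations: np.ndarray) -> float:
    """Normalized entropy of binary firing patterns for one layer.

    activations: array of shape (n_samples, n_units). A unit is treated as
    "firing" when its activation is > 0 (assumption for this sketch).
    """
    states = (activations > 0).astype(np.uint8)
    # Count how often each distinct firing pattern occurs across samples.
    _, counts = np.unique(states, axis=0, return_counts=True)
    p = counts / counts.sum()
    entropy = -np.sum(p * np.log2(p))
    max_entropy = states.shape[1]  # n_units bits if all states were equally likely
    return float(entropy / max_entropy) if max_entropy > 0 else 0.0


def network_efficiency(per_layer: list[float]) -> float:
    """Aggregate layer efficiencies with a geometric mean (one common choice)."""
    vals = np.clip(np.asarray(per_layer, dtype=float), 1e-12, None)
    return float(np.exp(np.mean(np.log(vals))))


def aiq(accuracy: float, efficiency: float, beta: float = 2.0) -> float:
    """Weighted geometric mean of accuracy and efficiency (assumed form).

    beta > 1 weights accuracy more heavily than neural efficiency.
    """
    return (accuracy ** beta * efficiency) ** (1.0 / (beta + 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake post-ReLU activations for two layers of a small network.
    layer1 = rng.normal(size=(1000, 300)).clip(min=0)
    layer2 = rng.normal(size=(1000, 100)).clip(min=0)
    eff = network_efficiency([layer_efficiency(layer1), layer_efficiency(layer2)])
    print(f"neural efficiency ~ {eff:.3f}, aIQ ~ {aiq(0.98, eff):.3f}")
```

In this form, the exponent `beta` controls the trade-off the abstract describes: a large `beta` recovers an accuracy-only ranking, while smaller values increasingly reward networks that achieve their accuracy with fewer, more fully utilized parameters.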
