What does it mean to understand a neural network?

We can define a neural network that learns to recognize objects in fewer than 100 lines of code. After training, however, the network is characterized by millions of weights that collectively encode what it knows about many object types across visual scenes. Such networks are thus dramatically easier to understand in terms of the code that creates them than in terms of their resulting properties, such as tuning curves or connectivity. By analogy, we conjecture that the rules governing development and learning in brains may be far easier to understand than the properties they produce. This analogy suggests that neuroscience would benefit from a greater focus on learning and development.
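
To make the first claim concrete, here is a minimal sketch of such a network, assuming PyTorch and torchvision are available and using CIFAR-10 as a stand-in object-recognition task; the architecture and hyperparameters are illustrative choices, not taken from any particular system. The definition and training loop fit comfortably under 100 lines.

```python
# A minimal sketch (assuming PyTorch/torchvision): a small convolutional
# network trained to recognize objects, defined in well under 100 lines.
# Architecture and hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

class SmallConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # 32x32 -> 16x16
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # 16x16 -> 8x8
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)

def train(epochs=5, batch_size=128, lr=1e-3):
    # CIFAR-10 stands in for "object types across visual scenes".
    data = datasets.CIFAR10(root="data", train=True, download=True,
                            transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=batch_size, shuffle=True)
    model = SmallConvNet()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(images), labels)
            loss.backward()  # one gradient per weight, ~545k in total
            opt.step()
    return model

if __name__ == "__main__":
    train()
```

Counting parameters makes the asymmetry explicit: even this toy model already has roughly 545,000 weights (and realistic architectures have millions), so the compact description of the system is the program plus the training data, not the trained weight matrix.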
