Engineering a Less Artificial Intelligence

Despite enormous progress in machine learning, artificial neural networks still lag behind brains in their ability to generalize to new situations. Given identical training data, differences in generalization are caused by many defining features of a learning algorithm, such as network architecture and learning rule. Their joint effect, called "inductive bias," determines how well any learning algorithm, or brain, generalizes: robust generalization needs good inductive biases. Artificial networks use rather nonspecific biases and often latch onto patterns that are informative only about the statistics of the training data and may not generalize to different scenarios. Brains, on the other hand, routinely generalize across comparatively drastic changes in sensory input. We highlight some shortcomings of state-of-the-art learning algorithms compared to biological brains and discuss several ideas about how neuroscience can guide the quest for better inductive biases by providing useful constraints on representations and network architecture.
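
To make the notion of inductive bias concrete, here is a minimal sketch (our illustration, not an example from the paper): two models with different inductive biases are fit to identical training data, both reach low training error, yet their predictions diverge once the input statistics shift. The data, polynomial degrees, and test point are hypothetical and chosen purely for illustration.

```python
# Hypothetical sketch: the "inductive bias" here is the polynomial degree.
import numpy as np

rng = np.random.default_rng(0)

# Identical training data for both models.
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2.0 * np.pi * x_train) + 0.05 * rng.standard_normal(x_train.size)

# Strong smoothness bias: low-degree polynomial.
smooth = np.polynomial.Polynomial.fit(x_train, y_train, deg=3)

# Weak bias: high-degree polynomial that can also fit the noise.
flexible = np.polynomial.Polynomial.fit(x_train, y_train, deg=7)

# Both models achieve low error on the training data ...
for name, model in [("deg 3", smooth), ("deg 7", flexible)]:
    mse = np.mean((model(x_train) - y_train) ** 2)
    print(f"{name}: train MSE = {mse:.4f}")

# ... but their predictions diverge just outside the training range,
# i.e., under a shift in the input statistics.
x_shift = 1.2
print(f"deg 3 at x={x_shift}: {smooth(x_shift):+.2f}")
print(f"deg 7 at x={x_shift}: {flexible(x_shift):+.2f}")
```

The same logic applies when the bias is architectural rather than a model-capacity knob, for instance the translation equivariance built into convolutional networks, which lets a feature learned at one image location generalize to all others.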
