Deep Networks with Internal Selective Attention through Feedback Connections

Traditional convolutional neural networks (CNNs) are stationary and feedforward: they neither change their parameters during evaluation nor use feedback from higher to lower layers. Real brains do both, and so does our Deep Attention Selective Network (dasNet) architecture. DasNet's feedback structure can dynamically alter its convolutional filter sensitivities during classification. It harnesses the power of sequential processing to improve classification performance by allowing the network to iteratively focus its internal attention on some of its convolutional filters. Feedback is trained through direct policy search in a huge million-dimensional parameter space, using scalable natural evolution strategies (SNES). On the CIFAR-10 and CIFAR-100 datasets, dasNet outperforms the previous state-of-the-art model.
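To make the idea of direct policy search with a natural evolution strategy concrete, the following is a minimal sketch of a diagonal-Gaussian (SNES-style) search loop in plain Python. It is not the dasNet training code: the objective `sphere`, the function name `snes_minimize`, the population size, the starting point, and the learning rates are all assumptions chosen for illustration. In a dasNet-like setting the search vector would presumably hold the parameters of the feedback policy, and the loss would be a classification error measured after several attention passes.

```python
import math
import random


def snes_minimize(loss, dim, iterations=300, population=20, seed=0):
    """Minimize `loss` with a diagonal-Gaussian natural evolution strategy.

    Illustrative sketch only; hyperparameters are assumed defaults.
    """
    rng = random.Random(seed)
    mu = [3.0] * dim      # mean of the search distribution (assumed start)
    sigma = [1.0] * dim   # per-dimension standard deviation
    eta_mu = 1.0
    eta_sigma = (3.0 + math.log(dim)) / (5.0 * math.sqrt(dim))

    # Rank-based utilities: the best sample gets the largest weight,
    # and the weights sum to zero, so only the ordering of losses matters.
    raw = [max(0.0, math.log(population / 2.0 + 1.0) - math.log(rank))
           for rank in range(1, population + 1)]
    total = sum(raw)
    utilities = [r / total - 1.0 / population for r in raw]

    for _ in range(iterations):
        # Sample standard-normal noise and map it into parameter space.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(population)]
        candidates = [[mu[i] + sigma[i] * s[i] for i in range(dim)]
                      for s in noise]

        # Sort sample indices from lowest to highest loss.
        order = sorted(range(population), key=lambda k: loss(candidates[k]))

        # Accumulate natural-gradient estimates for mean and step sizes.
        grad_mu = [0.0] * dim
        grad_sigma = [0.0] * dim
        for u, k in zip(utilities, order):
            s = noise[k]
            for i in range(dim):
                grad_mu[i] += u * s[i]
                grad_sigma[i] += u * (s[i] * s[i] - 1.0)

        # Update the search distribution.
        for i in range(dim):
            mu[i] += eta_mu * sigma[i] * grad_mu[i]
            sigma[i] *= math.exp(0.5 * eta_sigma * grad_sigma[i])

    return mu


def sphere(x):
    """Toy objective standing in for a classification loss."""
    return sum(v * v for v in x)


best = snes_minimize(sphere, dim=10)
print(sphere(best))
```

Because each dimension keeps only a mean and a standard deviation, memory and update cost grow linearly with the number of parameters, which is what makes this family of methods plausible for search spaces with around a million dimensions.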
