Paper information - Reinforcement Learning of Linking and Tracing Contours in Recurrent Neural Networks

The processing of a visual stimulus can be subdivided into a number of stages. Upon stimulus presentation, there is an early phase of feedforward processing in which visual information is propagated from lower to higher visual areas for the extraction of basic and complex stimulus features. This is followed by a later phase in which horizontal connections within areas and feedback connections from higher areas back to lower areas come into play. In this later phase, image elements that are behaviorally relevant are grouped according to Gestalt grouping rules and are labeled in the cortex with enhanced neuronal activity (a process known in psychology as object-based attention). Recent neurophysiological studies revealed that reward-based learning influences these recurrent grouping processes, but it is not well understood how rewards train recurrent circuits for perceptual organization. This paper examines the mechanisms for reward-based learning of new grouping rules. We derive a learning rule that can explain how rewards influence the information flow through feedforward, horizontal and feedback connections. We illustrate the efficiency of this learning rule with two tasks that have been used to study the neuronal correlates of perceptual organization in early visual cortex. The first task, contour integration, demands the integration of collinear contour elements into an elongated curve. We show how reward-based learning enhances the representation of the to-be-grouped elements at early levels of a recurrent neural network, just as is observed in the visual cortex of monkeys. The second task is curve tracing, where the aim is to determine the endpoint of an elongated curve composed of connected image elements. When trained with the new learning rule, neural networks learn to propagate enhanced activity over the curve, in accordance with neurophysiological data. We close the paper with a number of model predictions that can be tested in future neurophysiological and computational studies.
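To make the curve-tracing task concrete, the following is a minimal Python sketch of the behavior the abstract describes: enhanced activity spreads step by step over connected image elements until it reaches the end of the curve. It is only an illustration under stated assumptions. The spreading here is hand-wired rather than learned from rewards, so it is not the paper's network or learning rule, and the grid encoding, the 8-neighbor connectivity, and the names `trace_curve` and `IMAGE` are hypothetical choices made for this example.

```python
def trace_curve(image, start):
    """Spread an 'enhanced activity' label from `start` over connected curve pixels.

    image: list of equal-length strings; '#' marks a curve element (assumed encoding).
    start: (row, col) of the pixel where tracing begins; must be a '#' pixel.
    Returns (arrival, endpoint): `arrival` maps each labeled pixel to the
    iteration at which the label reached it, and `endpoint` is the pixel
    that was reached last.
    """
    rows, cols = len(image), len(image[0])
    arrival = {start: 0}
    frontier = [start]
    step = 0
    while frontier:
        step += 1
        next_frontier = []
        for r, c in frontier:
            # Visit the 8 neighbors of each pixel labeled in the previous step.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and image[nr][nc] == '#'
                            and (nr, nc) not in arrival):
                        arrival[(nr, nc)] = step
                        next_frontier.append((nr, nc))
        frontier = next_frontier
    # The pixel labeled last is taken as the end of the traced curve.
    endpoint = max(arrival, key=arrival.get)
    return arrival, endpoint


# A target curve (rows 0 to 2) and an unconnected distractor curve (row 4).
IMAGE = [
    "#####...",
    "....#...",
    "....###.",
    "........",
    "#######.",
]

arrival, endpoint = trace_curve(IMAGE, start=(0, 0))
print("endpoint of traced curve:", endpoint)
print("pixels labeled:", len(arrival))
```

The label never reaches the distractor curve because no distractor pixel is adjacent to a target pixel. This mirrors the idea that only elements connected to the traced curve receive enhanced activity.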
