Cognitive Control

This paper is inspired by the remarkable way in which cognitive control manifests itself in the human brain. It addresses the many facets involved in controlling directed information flow in a dynamic system, culminating in the notion of the information gap, defined as the difference between relevant information (the useful part of what is extracted from the incoming measurements) and sufficient information (the information needed to achieve minimal risk). The notion of the information gap leads naturally to a definition of cognitive control itself. Another important idea is then described, namely the two-state model, in which one state is the system's state and the other is the entropic state, which provides an essential metric for quantifying the information gap. The entropic state is computed in the perceptual part (i.e., the perceptor) of the dynamic system and sent directly to the controller as feedback information. This feedback provides the cognitive controller with the information about the environment and the system that it needs to bring reinforcement learning (RL) into play; RL, incorporating planning as an integral part, is at the very heart of cognitive control. The stage is then set for a computational experiment involving cognitive radar, wherein the cognitive controller is enabled to control the receiver via the environment. The experiment demonstrates how RL provides the mechanism for improved utilization of computational resources while still delivering good performance through the use of planning. The paper finishes with concluding remarks.
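As a rough illustration of the feedback loop described above, the following sketch computes an entropy-based quantity in a "perceptor" and lets a "controller" use it to choose an action. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar Gaussian estimate, the use of differential entropy as a stand-in for the entropic state, the scalar Kalman measurement update, and the function names. The greedy one-step choice also stands in for reinforcement learning with planning and should not be read as the paper's algorithm.

```python
import math


def gaussian_entropy(variance: float) -> float:
    """Differential entropy (in nats) of a scalar Gaussian.

    Assumption: used here as a stand-in for the entropic state.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def posterior_variance(prior_variance: float, noise_variance: float) -> float:
    """Scalar Kalman measurement update of the estimate variance."""
    return prior_variance * noise_variance / (prior_variance + noise_variance)


def entropic_state_after(prior_variance: float, noise_variance: float) -> float:
    """Perceptor side: entropy of the estimate after one measurement."""
    return gaussian_entropy(posterior_variance(prior_variance, noise_variance))


def pick_action(prior_variance: float, candidate_noise_variances: list) -> float:
    """Controller side: greedily pick the action (hypothetically, a sensing
    configuration indexed by its measurement-noise variance) that yields the
    lowest predicted entropy. This is a one-step lookahead, not RL."""
    return min(
        candidate_noise_variances,
        key=lambda r: entropic_state_after(prior_variance, r),
    )


if __name__ == "__main__":
    prior = 4.0
    actions = [0.5, 1.0, 2.0]
    best = pick_action(prior, actions)
    print("chosen noise variance:", best)
    print("entropy before:", gaussian_entropy(prior))
    print("entropy after: ", entropic_state_after(prior, best))
```

In this toy setting the entropy fed back to the controller always decreases after a measurement, and the controller simply prefers the least noisy option. A fuller treatment would attach a cost to each action and learn the action values over time.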
