Coherent Infomax as a Computational Goal for Neural Systems

Signal processing in the cerebral cortex is thought to involve a common multi-purpose algorithm embodied in a canonical cortical micro-circuit that is replicated many times both within and across cortical regions. Operation of this algorithm produces widely distributed but coherent and relevant patterns of activity. The theory of Coherent Infomax provides a formal specification of the objectives of such an algorithm. It also formally derives specifications for both the short-term processing dynamics and the learning rules by which the connection strengths between units in the network adapt to the environment in which the system finds itself. A central assumption of the theory is that local processors can combine reliable signal coding with flexible use of those codes because they have two classes of synaptic connection: driving connections, which specify the information content of the neural signals, and contextual connections, which modulate that signal processing. Here, we make the biological relevance of this theory more explicit in four ways: by placing more emphasis on the contextual guidance of ongoing processing; by showing that Coherent Infomax is consistent with a particular Bayesian interpretation of the contextual guidance of learning and processing; by explicitly specifying rules for on-line learning; and by suggesting approximations that make the learning rules computationally feasible in systems composed of very many local processors.
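
The distinction between driving and contextual connections can be illustrated with a small sketch. The function below is an assumption chosen for illustration, not a form stated in this abstract: it takes an integrated driving input r and an integrated contextual input c and returns (r/2)(1 + exp(2rc)), so that context can amplify or attenuate the response without being able to produce or reverse it on its own. The names modulated_activation and firing_probability are hypothetical.

```python
import math


def modulated_activation(r: float, c: float) -> float:
    """Illustrative modulatory activation (an assumed form, not taken from the abstract).

    r: integrated driving input, which determines what the output signals.
    c: integrated contextual input, which only scales the effect of r.
    """
    return 0.5 * r * (1.0 + math.exp(2.0 * r * c))


def firing_probability(r: float, c: float) -> float:
    """Logistic output probability for a binary local processor (assumed)."""
    return 1.0 / (1.0 + math.exp(-modulated_activation(r, c)))


if __name__ == "__main__":
    # Compare no drive, no context, agreeing context, and disagreeing context.
    for r, c in [(0.0, 5.0), (1.0, 0.0), (1.0, 1.0), (1.0, -1.0)]:
        a = modulated_activation(r, c)
        p = firing_probability(r, c)
        print(f"r={r:+.1f} c={c:+.1f} activation={a:.3f} p={p:.3f}")
```

Under this assumed form, the activation is zero whenever the driving input is zero, equals r when there is no context, grows in magnitude when r and c have the same sign, and shrinks toward r/2 when they disagree, while always keeping the sign of r.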
