The Inevitability of Probability: Probabilistic Inference in Generic Neural Networks Trained with Non-Probabilistic Feedback

Humans and other animals have been shown to perform near-optimal probabilistic inference in a wide range of psychophysical tasks. On the face of it, this is surprising, because optimal probabilistic inference in each case is associated with highly non-trivial behavioral strategies. Yet subjects typically receive little to no feedback during most of these tasks, and the feedback they do receive is not explicitly probabilistic in nature. How, then, can subjects learn such non-trivial behavioral strategies from scarce, non-probabilistic feedback? We show that generic feed-forward and recurrent neural networks, trained on a relatively small number of non-probabilistic examples with simple error-based learning rules, can perform near-optimal probabilistic inference in standard psychophysical tasks. The hidden layers of the trained networks develop a novel sparsity-based probabilistic population code. In all tasks, performance asymptotes at very small network sizes, usually on the order of tens of hidden units, owing to the low computational complexity of typical psychophysical tasks. For the same reason, the trained networks also generalize remarkably well to stimulus conditions not seen during training. We further show that in a probabilistic binary categorization task with arbitrary categories, in which both human and monkey subjects have been shown to perform probabilistic inference, a monkey subject's performance (but not the human subjects') is consistent with an error-based learning rule. Our results suggest that near-optimal probabilistic inference in standard psychophysical tasks emerges naturally and robustly in generic neural networks trained with error-based learning rules, even when neither the training objective nor the training examples are explicitly probabilistic, and that such networks can serve as simple, plausible neural models of probabilistic inference.
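To make the setup concrete, the following is a minimal sketch, not the paper's actual code: the task (two-cue combination with trial-to-trial reliability changes), the network size, the noise ranges, and the training schedule are all illustrative assumptions. A generic one-hidden-layer ReLU network receives two noisy cues plus their reliabilities and is trained with plain squared error against the true stimulus value on each trial, i.e. non-probabilistic feedback with an error-based learning rule. The comparison at the end checks whether the trained network approaches the Bayes-optimal reliability-weighted combination rather than a fixed unweighted average.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_batch(n):
    """Cue-combination trials: two noisy observations of a common stimulus,
    with trial-to-trial variation in each cue's noise level (reliability)."""
    s = rng.uniform(-1.0, 1.0, n)                 # true stimulus
    sig1 = rng.uniform(0.1, 0.5, n)               # cue 1 noise std
    sig2 = rng.uniform(0.1, 0.5, n)               # cue 2 noise std
    x1 = s + sig1 * rng.standard_normal(n)
    x2 = s + sig2 * rng.standard_normal(n)
    X = np.stack([x1, x2, sig1, sig2], axis=1)    # network inputs
    return X, s

# Generic one-hidden-layer ReLU network.  The feedback on each trial is
# just the true stimulus value and the loss is plain squared error --
# nothing about the training signal is explicitly probabilistic.
n_in, n_hid = 4, 30
W1 = 0.5 * rng.standard_normal((n_in, n_hid))
b1 = np.zeros(n_hid)
W2 = 0.5 * rng.standard_normal(n_hid)
b2 = 0.0
lr = 0.05

for step in range(8000):
    X, s = make_batch(64)
    h = np.maximum(X @ W1 + b1, 0.0)              # hidden layer
    y = h @ W2 + b2                               # scalar estimate
    dy = 2.0 * (y - s) / len(s)                   # d(MSE)/dy
    # Manual backprop (simple error-based learning rule).
    dW2 = h.T @ dy
    db2 = dy.sum()
    dh = np.outer(dy, W2) * (h > 0)
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# Held-out comparison against the reliability-weighted (maximum-likelihood)
# combination and an unweighted average of the two cues.
X, s = make_batch(5000)
h = np.maximum(X @ W1 + b1, 0.0)
y = h @ W2 + b2
w1, w2 = 1.0 / X[:, 2] ** 2, 1.0 / X[:, 3] ** 2
opt = (w1 * X[:, 0] + w2 * X[:, 1]) / (w1 + w2)
avg = 0.5 * (X[:, 0] + X[:, 1])
print("network MSE   :", np.mean((y - s) ** 2))
print("optimal MSE   :", np.mean((opt - s) ** 2))
print("unweighted MSE:", np.mean((avg - s) ** 2))
```

Because the cues' reliabilities vary from trial to trial, any fixed linear readout of the cues is suboptimal; a network that beats the unweighted average must have learned to re-weight the cues by their reliability, which is the signature of probabilistic inference the abstract refers to.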
