Learning to perform auditory discriminations from observation is efficient but less robust than learning from experience

Social learning enables complex societies. However, it is largely unknown how insights obtained from observation compare with insights gained from trial and error, particularly in terms of their robustness. We use aversive reinforcement to train “experimenter” zebra finches to discriminate between auditory stimuli in the presence of an “observer” finch. We find that experimenters are slow to successfully discriminate the stimuli but immediately generalize their ability to a new set of similar stimuli. By contrast, observers subjected to the same task instantly discriminate the initial stimulus set but require more time for successful generalization. Drawing upon machine learning insights, we suggest that observer learning has evolved to rapidly absorb sensory statistics without pressure to minimize neural resources, whereas learning from experience is endowed with a form of regularization that enables robust inference.
