Learning auditory discriminations from observation is efficient but less robust than learning from experience

Social learning enables complex societies. However, it is largely unknown how insights obtained from observation compare with insights gained from trial-and-error, in particular in terms of their robustness. Here, we use aversive reinforcement to train “experimenter” zebra finches to discriminate between auditory stimuli in the presence of an “observer” finch. We show that experimenters are slow to successfully discriminate the stimuli, but immediately generalize their ability to a new set of similar stimuli. By contrast, observers subjected to the same task are able to discriminate the initial stimulus set, but require more time for successful generalization. Drawing on concepts from machine learning, we suggest that observer learning has evolved to rapidly absorb sensory statistics without pressure to minimize neural resources, whereas learning from experience is endowed with a form of regularization that enables robust inference.

Many animals can learn, not just by direct experience, but by observing another animal performing a task. Here, the authors show in zebra finches that observer learning is efficient, but differs from direct learning in that it is less generalizable to novel stimuli.
