Reward-based learning for virtual neurorobotics through emotional speech processing

Reward-based learning is common in everyday life, notably in methods for teaching children. It also allows machines and software agents to automatically determine the ideal behavior from simple reward feedback (e.g., encouragement) in order to maximize their performance. Advancements in affective computing, especially emotional speech processing (ESP), have allowed for more natural interaction between humans and robots. Our research focuses on integrating a novel ESP system into a relevant virtual neurorobotic (VNR) application. We created an emotional speech classifier that successfully distinguished happy and utterances. The accuracy of the system was 95.3% in the offline mode (using an emotional speech database) and 98.7% in the live mode (using live recordings). It was then integrated into a neurorobotic scenario in which a virtual neurorobot had to learn a simple exercise through reward-based learning. If the correct decision was made, the robot received a spoken reward, which in turn stimulated synapses (in our simulated model) undergoing spike-timing-dependent plasticity (STDP) and reinforced the corresponding neural pathways. Together, our ESP and neurorobotic systems allowed the neurorobot to successfully and consistently learn the exercise. The integration of ESP into a real-time computational neuroscience architecture is a first step toward the combination of human emotions and virtual neurorobotics.
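
The reward mechanism summarized above can be illustrated with a minimal sketch. It is not the model used in this work: it assumes a standard pair-based exponential STDP window and treats the spoken reward as a simple boolean gate on the weight update. The function names, the amplitudes `a_plus` and `a_minus`, the time constant `tau_ms`, and the weight bounds are all hypothetical values chosen only for illustration.

```python
import math


def stdp_delta(dt_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Pair-based STDP window (illustrative parameters, not from the paper).

    dt_ms is t_post - t_pre in milliseconds. A presynaptic spike that
    precedes the postsynaptic spike (dt_ms > 0) potentiates the synapse;
    the reverse order depresses it.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0


def rewarded_update(weight, dt_ms, rewarded, w_min=0.0, w_max=1.0):
    """Apply the STDP change only when a reward was detected.

    `rewarded` stands in for the output of an emotional speech classifier
    (assumed here to be a plain boolean). The weight is clipped to
    the range [w_min, w_max].
    """
    if not rewarded:
        return weight
    return min(w_max, max(w_min, weight + stdp_delta(dt_ms)))


if __name__ == "__main__":
    w = 0.5
    # Pre-before-post pairing (10 ms) followed by a spoken reward.
    print(rewarded_update(w, 10.0, rewarded=True))
    # Same pairing without a reward leaves the weight unchanged.
    print(rewarded_update(w, 10.0, rewarded=False))
```

In this simplified form, a pathway whose activity led to the correct decision is strengthened only on rewarded trials, which is the qualitative behavior the abstract describes.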
