Emoty: an Emotionally Sensitive Conversational Agent for People with Neurodevelopmental Disorders

Our research aims at exploiting the advances in conversational technology to support people with Neurodevelopmental Disorder (NDD). NDD is a group of conditions characterized by severe deficits in the cognitive, emotional, and motor areas that produce severe impairments in communication and social functioning. This paper presents the design, technology, and exploratory evaluation of Emoty, a spoken Conversational Agent (CA) created specifically for individuals with NDD. The goal of Emoty is to help these persons enhance communication abilities related to emotion recognition and expression, which are fundamental in any form of human relationship. The system detects emotions both from the semantics of the speech, by calling the IBM Watson Tone Analyzer API, and from the harmonic features of the audio, through an "all-of-us" Deep Learning model. The design and evaluation of Emoty are based on close collaboration between computer engineers and specialists in NDD (psychologists, neurological doctors, educators).

1. Background and Introduction

The general scope of our research is to exploit the advances in Conversational Technology to support persons with Neurodevelopmental Disorder (NDD). In particular, we investigate the use of spoken Conversational Agents to mitigate the impairments of these persons related to the difficulty of recognizing and expressing emotions, a problem clinically referred to as alexithymia.

Conversational Technology is a general term for integrated technologies that build on results from various fields: machine learning, natural language processing, speech recognition, dialog generation, and human–computer interaction, among others. A Conversational Agent is a system that exploits Conversational Technology to interpret and respond to statements made by users in natural language.

Neurodevelopmental Disorder (NDD) denotes a group of conditions characterized by severe deficits in the cognitive, emotional, and motor areas that produce severe impairments in social functioning. Its causes can be genetic or result from lesions or environmental factors. The range of developmental deficits varies from specific limitations of learning or control of executive functions to global impairments of social skills or intelligence. Intellectual Disability (ID), Attention Deficit Hyperactivity Disorder (ADHD), and Autism Spectrum Disorder (ASD) are all classified as forms of NDD [1][2]. NDD affects at least 3% of the world population. Autism, the most common form of NDD, currently affects an estimated 60 million people worldwide. In the United States alone, the social cost of caring for people with autism amounted to $367 billion in 2015 and, if autism's prevalence continues the steep rise seen over the last decade, the projected costs will top $1 trillion by 2025 [3]. Most kinds of NDD are chronic, but early and focused interventions are thought to at least mitigate their effects [4][5].

Alexithymia is a personality construct characterized by the subclinical inability to identify and describe emotions in the self. The core characteristics of alexithymia, which often occurs among persons with NDD, are marked dysfunction in emotional awareness, social attachment, and interpersonal relating. Furthermore, people with alexithymia have difficulty distinguishing and appreciating the emotions of others, which is thought to lead to nonempathic and ineffective emotional responding.
Alexithymia is traditionally treated with counselling or talk therapies that involve various techniques: group conversations, individual reading of emotional stories, engaging in creative art, daily journaling, and relaxation exercises. While these methods are reported to bring some benefit in mitigating the effects of alexithymia, specialists are also looking for new approaches [13].

The role of interactive technology in NDD has been explored in several studies. On the one hand, children's high exposure to the chaotic sensory stimulation of videogames and multimedia applications has been blamed as one of the causes of the increasing number of cases of cognitive disorders during the developmental age [6]. On the other hand, recent research on children's development has acknowledged interactive technology as a potentially useful tool to support existing therapies and new approaches to improving the learning process [7][8][9][10][11][12].

Embracing this latter vision, we have developed a novel conversational service called Emoty that plays the role of emotional facilitator and trainer for persons with NDD who manifest severe forms of alexithymia. Emoty is a voice-based, Italian-speaking Dialog System able to converse with users in ordinary natural language and to entertain them with small talk and educational games. Emoty does not act as a virtual assistant for daily life support; rather, it aims at helping people with NDD develop better emotional control and self-awareness, which would enhance their communication capabilities and consequently improve their quality of life. Emoty exploits conversational technologies together with Machine Learning and Deep Learning techniques for emotion recognition from spoken sentences, based on the processing of the user's audio pitch. The project has been carried out in close collaboration with psychologists, neurological doctors, and caregivers, who actively participated in eliciting the key requirements, evaluating iterative prototypes, and performing an empirical evaluation.

To our knowledge, Emoty is the first conversational agent that addresses the needs of emotional learning among persons with NDD, with the goal of contributing to the improvement of their communication skills. The originality of Emoty also lies in the exploitation of audio pitch for emotion recognition from the user's spoken dialogues. From a more general perspective, our research might pave the way towards a better understanding of the cognitive and emotional mechanisms associated with NDD and towards new forms of therapeutic intervention for these subjects.

The rest of the paper is organized as follows. In Section 2, we provide an overview of the state of the art on artificial conversational agents and on automatic emotion recognition from the audio pitch, and then give an overview of existing technologies supporting people with NDD. In Section 3, we describe Emoty from a high-level point of view, considering functional and non-functional aspects. Section 4 illustrates the system architecture and its core modules; particularly relevant are the conversational module and the Cognitive Computing unit, responsible for emotion detection from the harmonic features of the audio. Sections 5 and 6 describe the procedure followed to collect and analyze data during the first exploratory study and report its earliest results. Section 7 draws general conclusions and outlines the next steps in our research.
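As a concrete illustration of the semantic side of this processing, the following sketch shows how a transcribed utterance could be sent to the IBM Watson Tone Analyzer through the ibm-watson Python SDK. It is a minimal example rather than the system's actual code: the API key, service URL, and sample sentence are placeholders.

```python
# Minimal sketch of text-based tone analysis with the IBM Watson Tone Analyzer,
# using the official ibm-watson Python SDK. Credentials and URL are placeholders.
from ibm_watson import ToneAnalyzerV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("YOUR_API_KEY")  # placeholder credential
tone_analyzer = ToneAnalyzerV3(version="2017-09-21", authenticator=authenticator)
tone_analyzer.set_service_url("https://api.eu-de.tone-analyzer.watson.cloud.ibm.com")  # placeholder URL

# Analyze a transcribed utterance; the service returns document-level tones
# (e.g., joy, sadness, anger) with confidence scores.
result = tone_analyzer.tone(
    tone_input={"text": "I am really happy to play this game with you!"},
    content_type="application/json",
).get_result()

for tone in result["document_tone"]["tones"]:
    print(tone["tone_id"], tone["score"])
```

The audio side could look roughly like the sketch below, where pitch and spectral statistics are extracted from each utterance with librosa and fed to a small neural classifier. The feature set, emotion labels, and classifier here are assumptions standing in for the "all-of-us" Deep Learning model; a corpus such as EMOVO [10] could supply the labeled utterances.

```python
# Illustrative sketch (not the system's actual model): pitch/harmonic feature
# extraction with librosa and a small neural-network classifier as a stand-in
# for the "all-of-us" Deep Learning model. Labels and hyperparameters are assumed.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

EMOTIONS = ["anger", "joy", "sadness", "fear", "neutral"]  # assumed label set

def extract_features(wav_path, sr=16000):
    """Summarize one utterance as a fixed-length vector of pitch and spectral statistics."""
    y, _ = librosa.load(wav_path, sr=sr)
    # Fundamental-frequency (pitch) contour via probabilistic YIN.
    f0, _, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr)
    f0 = f0[~np.isnan(f0)]
    if f0.size == 0:
        f0 = np.zeros(1)  # unvoiced utterance: fall back to zero pitch statistics
    # MFCCs as a compact summary of the spectral envelope.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([[f0.mean(), f0.std()],            # pitch statistics
                           mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_classifier(labeled_paths):
    """labeled_paths: list of (wav_path, emotion_label) pairs from an emotional-speech corpus."""
    X = np.stack([extract_features(path) for path, _ in labeled_paths])
    y = [EMOTIONS.index(label) for _, label in labeled_paths]
    clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
    clf.fit(X, y)
    return clf
```

Section 4 details how the actual system organizes this processing within its conversational module and Cognitive Computing unit.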

References

[1] P. Wilson et al. The Nature of Emotions, 2012.
[2] N. O. Obiyo et al. The Use of ICT as an Integral Teaching and Learning Tool for Children with Autism: A Challenge for Nigeria Education System, 2013.
[3] M. Guralnick et al. Why Early Intervention Works: A Systems Perspective. Infants and Young Children, 2011.
[4] Gordon Alley-Young. Technology Tools for Students with Autism: Innovations that Enhance Independence and Learning, 2016.
[5] Kelsey Rising. Use of Classroom Technology to Promote Learning Among Students with Autism, 2017.
[6] Shwetak N. Patel et al. Convey: Exploring the Use of a Context View for Chatbots. CHI, 2018.
[7] G. Cioni et al. Early intervention in neurodevelopmental disorders: underlying neural mechanisms. Developmental Medicine and Child Neurology, 2016.
[8] M. Lumley et al. The Assessment of Alexithymia in Medical Settings: Implications for Understanding and Treating Health Problems. Journal of Personality Assessment, 2007.
[9] M. Silver et al. Evaluation of a New Computer Intervention to Teach People with Autism or Asperger Syndrome to Recognize and Predict Emotions in Others. Autism: The International Journal of Research and Practice, 2001.
[10] Giovanni Costantini et al. EMOVO Corpus: an Italian Emotional Speech Database. LREC, 2014.
[11] Erik Marchi et al. Recent Developments and Results of ASC-Inclusion: An Integrated Internet-Based Environment for Social Inclusion of Children with Autism Spectrum Conditions. IUI, 2015.
[12] D. Kolb. Experiential Learning: Experience as the Source of Learning and Development, 1983.
[13] Jan Derboven et al. Designing Voice Interaction for People with Physical and Speech Impairments. NordiCHI, 2014.
[14] C. Chibelushi et al. Facial Expression Recognition: A Brief Tutorial Overview, 2022.
[15] Emily C. Bouck et al. High-Tech or Low-Tech? Comparing Self-Monitoring Systems to Increase Task Independence for Students With Autism, 2014.
[16] Daantje Derks et al. Emoticons and Online Message Interpretation, 2008.
[17] K. Sofronoff et al. A Multi-Component Social Skills Intervention for Children with Asperger Syndrome: The Junior Detective Training Program. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 2008.
[18] Gerald Penn et al. Designing Pronunciation Learning Tools: The Case for Interactivity against Over-Engineering. CHI, 2018.
[19] Simin Ghavifekr. Technology-Based Teaching and Learning: A Quantitative Analysis on Effectiveness of ICT Integration in Schools, 2015.
[20] Erik Marchi et al. ASC-Inclusion: Interactive Emotion Games for Social Inclusion of Children with Autism Spectrum Conditions, 2022.
[21] Leah Findlater et al. "Accessibility Came by Accident": Use of Voice-Controlled Intelligent Personal Assistants by People with Disabilities. CHI, 2018.
[22] Matthew S. Goodwin et al. Technology for just-in-time in-situ learning of facial affect for persons diagnosed with an autism spectrum disorder. ASSETS '08, 2008.
[23] Irina Verenikina et al. The Digital Technology in the Learning of Students with Autism Spectrum Disorders (ASD) in Applied Classroom Settings, 2010.
[24] M. Irigoyen et al. Exposure and Use of Mobile Media Devices by Young Children. Pediatrics, 2015.
[25] Stefan Steidl. Automatic classification of emotion related user states in spontaneous children's speech, 2009.
[26] Shrikanth S. Narayanan et al. Rachel: Design of an emotionally targeted interactive agent for children with autism. IEEE International Conference on Multimedia and Expo, 2011.
[27] Harald Sontheimer. Diseases of the Nervous System. The Hospital, 1877.
[28] Jichen Zhu et al. Patterns for How Users Overcome Obstacles in Voice User Interfaces. CHI, 2018.
[29] Randolph G. Bias et al. Research Methods for Human-Computer Interaction. J. Assoc. Inf. Sci. Technol., 2010.
[30] James W. Tanaka et al. Using computerized games to teach face recognition skills to children with autism spectrum disorder: the Let's Face It! program. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 2010.
[31] Klaus R. Scherer et al. Vocal communication of emotion: A review of research paradigms. Speech Communication, 2003.