Embodied Conversation

This chapter presents an overview of human-computer interfaces, a crucial element of any ambient intelligence solution. The focus is on embodied conversational agents, which enable communication with users in the most natural way. Different input and output modalities, together with the methods that process the captured information (e.g., automatic speech recognition, gesture recognition, natural language processing, dialog management, and text-to-speech synthesis), play a crucial role in providing a high quality of experience to the user. As an example, the use of an embodied conversational agent in the e-Health domain is proposed.
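The modalities listed above are typically chained into a processing pipeline: speech is recognized, interpreted, answered by a dialog manager, and rendered back as synthesized speech. A minimal sketch of such a pipeline is shown below; all class and method names are illustrative assumptions, with each stage a stub standing in for a real component, not an existing library API.

```python
# Sketch of an embodied-conversational-agent pipeline.
# Every class here is a hypothetical stub standing in for a real
# component (ASR, NLU, dialog management, TTS); the names are
# assumptions for illustration only.

class SpeechRecognizer:
    """Stands in for an ASR engine: audio -> text."""
    def transcribe(self, audio: bytes) -> str:
        # A real recognizer would decode the acoustic signal here.
        return audio.decode("utf-8")

class LanguageUnderstanding:
    """Stands in for NLU: text -> semantic frame (intent)."""
    def parse(self, text: str) -> dict:
        intent = "greeting" if "hello" in text.lower() else "unknown"
        return {"intent": intent, "text": text}

class DialogManager:
    """Chooses the agent's next act from the recognized intent."""
    def respond(self, frame: dict) -> str:
        if frame["intent"] == "greeting":
            return "Hello! How can I help you today?"
        return "Sorry, I did not understand that."

class SpeechSynthesizer:
    """Stands in for TTS: text -> audio (here, just bytes)."""
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")

def run_pipeline(audio_in: bytes) -> bytes:
    """Chain the stages: ASR -> NLU -> dialog -> TTS."""
    text = SpeechRecognizer().transcribe(audio_in)
    frame = LanguageUnderstanding().parse(text)
    reply = DialogManager().respond(frame)
    return SpeechSynthesizer().synthesize(reply)

print(run_pipeline(b"Hello agent").decode("utf-8"))
```

In a full embodied agent, parallel input channels (e.g., gesture recognition) would feed the same dialog manager, and the output stage would also drive the agent's facial animation and body behavior alongside the synthesized voice.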
