Tracking and fusion for multiparty interaction with a virtual character and a social robot

To give artificial characters human-like capabilities, we should equip them with the ability to infer user states. Such characters should understand users' behaviors through various sensors and respond using multimodal output. Beyond natural multimodal interaction, they should also be able to communicate with multiple users, and with each other, in multiparty interactions. Previous work on interactive virtual humans and social robots focuses mainly on one-to-one interactions. In this paper, we study the tracking and fusion aspects of multiparty interaction. We first give a general overview of our proposed multiparty interaction system and explain how it differs from previous work. We then detail the tracking and fusion component, including speaker identification, addressee detection, and a dynamic user entrance/leave mechanism based on user re-identification with a Kinect sensor. Finally, we present a case study with the system and discuss its current capabilities, limitations, and future work.
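
As a rough illustration of how an appearance-based user re-identification step like the one mentioned above can work, the following is a minimal sketch that matches a returning user against stored appearance signatures using OpenCV HSV colour histograms. It is not the paper's actual implementation; the function names, histogram binning, similarity measure, and matching threshold are all assumptions made for illustration.

```python
# Minimal sketch of appearance-based user re-identification (assumed design,
# not the paper's implementation). Each known user is represented by an HSV
# colour histogram of a body crop taken from the Kinect RGB stream.
import cv2
import numpy as np

def hsv_histogram(bgr_crop, bins=(8, 8, 4)):
    """Compute a normalised HSV colour histogram for a user's body crop."""
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, bins,
                        [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def reidentify(new_crop, known_users, threshold=0.6):
    """Return the id of the best-matching known user, or None for a new user.

    known_users: dict mapping user_id -> stored histogram.
    Similarity is histogram intersection; the threshold value is an assumption.
    """
    query = hsv_histogram(new_crop)
    best_id, best_score = None, 0.0
    for user_id, stored in known_users.items():
        score = cv2.compareHist(stored.astype(np.float32),
                                query.astype(np.float32),
                                cv2.HISTCMP_INTERSECT)
        if score > best_score:
            best_id, best_score = user_id, score
    return best_id if best_score >= threshold else None
```

In such a scheme, a user who leaves the sensor's field of view and later re-enters would be matched against the stored signatures; a score below the threshold would instead trigger the entrance path for a new user.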
