Identification and Engagement of Passive Subjects in Multiparty Conversations by a Humanoid Robot

In this work, we present a novel human-robot interaction (HRI) method for detecting and engaging passive subjects in multiparty conversations using a humanoid robot. Voice activity detection and speaker localization are combined with facial recognition to detect and identify non-participating subjects. Once a non-participating individual is identified, the robot addresses the subject with a fact related to the topic of the conversation, with the goal of encouraging the subject to join in. To generate sentences related to the conversation topic, automatic speech recognition and natural language processing techniques are employed. Preliminary experiments show that the method successfully identifies and engages passive subjects in a conversation.
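The core of the approach is tracking, per participant, how much speech activity the audio pipeline attributes to them, and flagging anyone whose share falls below a threshold as a candidate for engagement. The sketch below illustrates that decision logic only; the participant names, the `min_share` threshold, and the helper structure are hypothetical stand-ins, not the paper's actual implementation, and the upstream VAD, speaker-localization, and face-identification components are assumed to have already produced the per-speaker speaking times.

```python
# Hypothetical sketch of the passive-subject detection step: given
# per-participant speaking times accumulated by an upstream VAD +
# speaker-localization + face-ID pipeline (not shown), flag anyone
# whose share of the total speech is below a threshold.
from dataclasses import dataclass

@dataclass
class Participant:
    name: str
    speaking_time: float = 0.0  # cumulative seconds of detected speech

def find_passive_subjects(participants, min_share=0.1):
    """Return participants whose share of total speech is below min_share."""
    total = sum(p.speaking_time for p in participants)
    if total == 0:
        return []  # nobody has spoken yet; nothing to flag
    return [p for p in participants if p.speaking_time / total < min_share]

# Example: Carol has barely spoken, so she is flagged as passive.
group = [Participant("Alice", 25.0), Participant("Bob", 30.0), Participant("Carol", 1.0)]
passive = find_passive_subjects(group)
print([p.name for p in passive])  # → ['Carol']
```

Once a passive subject is flagged, the robot would select a topic-related fact (derived from the ASR transcript via the NLP components mentioned above) and address it to that person by name.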
