An interdisciplinary taxonomy of social cues and signals in the service of engineering robotic social intelligence

Understanding intentions is a complex social-cognitive task for humans, let alone machines. In this paper we discuss how the developing field of Social Signal Processing, in which social cues are assessed in order to interpret social signals, may help to establish a foundation for robotic social intelligence. We describe a taxonomy intended to further research and development in human-robot interaction (HRI) and to facilitate natural interactions between humans and robots. It is based on an interdisciplinary framework that integrates: (1) the sensors used to detect social cues, (2) the parameters for differentiating and classifying differing levels of those cues, and (3) the ways in which sets of social cues indicate specific social signals. Developing such a taxonomy is necessarily an iterative process, as technologies improve and social science researchers gain a better understanding of the complex interactions among vast numbers of social cue combinations. The goal of this paper is therefore to advance a taxonomy of this nature and thereby stimulate interdisciplinary collaboration in the development of advanced social intelligence that mutually informs robotic perception and robotic intelligence.
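The three-part framework above (sensors that detect cues, parameters that classify cue levels, and cue sets that indicate signals) can be sketched as a minimal data structure. The following Python illustration is purely hypothetical: every cue name, sensor name, parameter, and signal mapping below is an invented example, not content from the paper.

```python
from dataclasses import dataclass

# (1) + (2): a social cue, the sensors that can detect it, and the
# parameters used to differentiate levels of that cue.
@dataclass
class SocialCue:
    name: str          # e.g. "gaze_toward_partner" (hypothetical)
    sensors: list      # sensing modalities that can detect the cue
    parameters: dict   # classification parameters, e.g. thresholds

# (3): a social signal is indicated by a set of co-occurring cues.
SIGNAL_CUE_SETS = {
    "engagement": {"gaze_toward_partner", "decreasing_distance"},
    "discomfort": {"gaze_aversion", "increasing_distance"},
}

def infer_signals(observed_cues):
    """Return every signal whose full cue set is currently observed."""
    return [signal for signal, cues in SIGNAL_CUE_SETS.items()
            if cues <= observed_cues]

gaze = SocialCue(
    name="gaze_toward_partner",
    sensors=["rgb_camera", "eye_tracker"],
    parameters={"dwell_time_s": 1.5},
)

print(infer_signals({"gaze_toward_partner", "decreasing_distance"}))
# -> ['engagement']
```

The sketch makes the iterative nature of the taxonomy concrete: as sensing improves or social science identifies new cue combinations, entries are added or refined in `SIGNAL_CUE_SETS` without changing the inference step.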
