Social Intelligence Modeling using Wearable Devices

Social Signal Processing techniques make it possible to analyze human behavior in face-to-face social interactions in depth. Recent advances mean these techniques can now be used to augment social interactions, in particular a speaker's behavior during oral presentations. The goal of this paper is to train a computational model able to provide relevant feedback to a public speaker concerning his or her coverbal communication. The role of this model is thus to augment the social intelligence of the orator and, in turn, the quality of the presentation. To this end, we present an original interaction setting in which the speaker is equipped only with wearable devices. Several coverbal modalities were extracted and automatically annotated, namely speech volume, intonation, speech rate, eye gaze, hand gestures, and body movements. An offline report containing the performance scores on all modalities was sent to each participant. In addition, a post-experiment study was conducted to collect participants' opinions on several aspects of the studied interaction, and the results were largely positive. Moreover, we annotated the recommended feedback for each presentation session; to retrieve these annotations, a Dynamic Bayesian Network model was trained using the multimodal performance scores as inputs. We show that our assessment behavior model performs well compared to other models.
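The mapping from per-modality performance scores to a recommended feedback can be illustrated with a much simpler probabilistic model than the paper's Dynamic Bayesian Network: a naive-Bayes factorization over discretized scores, estimated by maximum likelihood with add-one smoothing. The modality names, score bins, feedback labels, and training sessions below are all invented for illustration and do not come from the paper's data.

```python
from collections import Counter, defaultdict

# Toy sessions: discretized performance scores per modality, paired with
# an annotated recommended-feedback label. Purely illustrative data.
SESSIONS = [
    ({"volume": "low", "rate": "ok",   "gaze": "ok"},  "raise_volume"),
    ({"volume": "low", "rate": "ok",   "gaze": "low"}, "raise_volume"),
    ({"volume": "ok",  "rate": "fast", "gaze": "ok"},  "slow_down"),
    ({"volume": "ok",  "rate": "fast", "gaze": "low"}, "slow_down"),
    ({"volume": "ok",  "rate": "ok",   "gaze": "ok"},  "none"),
]

def train(sessions):
    """Maximum-likelihood estimates of P(feedback) and P(score | feedback)."""
    prior = Counter(label for _, label in sessions)
    cond = defaultdict(Counter)  # (modality, label) -> Counter over score bins
    for scores, label in sessions:
        for modality, value in scores.items():
            cond[(modality, label)][value] += 1
    return prior, cond

def recommend(scores, prior, cond, alpha=1.0, n_bins=3):
    """Pick argmax_f P(f) * prod_m P(score_m | f), with add-one smoothing."""
    best, best_p = None, -1.0
    total = sum(prior.values())
    for label, count in prior.items():
        p = count / total
        for modality, value in scores.items():
            c = cond[(modality, label)]
            p *= (c[value] + alpha) / (sum(c.values()) + alpha * n_bins)
        if p > best_p:
            best, best_p = label, p
    return best

prior, cond = train(SESSIONS)
print(recommend({"volume": "low", "rate": "ok", "gaze": "ok"}, prior, cond))
```

A full DBN as used in the paper would additionally model temporal dependencies between successive time slices of a session; this sketch only shows the static inference step of ranking candidate feedback labels given one set of scores.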
