Modeling Human Communication Dynamics

Human face-to-face communication is a little like a dance: participants continuously adjust their behaviors based on verbal and nonverbal cues from the social context. Today’s computers and interactive devices still lack many of these human-like abilities to hold fluid and natural interactions. Leveraging recent advances in machine learning, audio-visual signal processing and computational linguistics, my research focuses on creating human-computer interaction (HCI) technologies able to analyze, recognize and predict subtle human communicative behaviors in a social context. I formalize this new research endeavor with a Human Communication Dynamics framework, addressing four key computational challenges: behavioral dynamic, multimodal dynamic, interpersonal dynamic and societal dynamic. Central to this research effort is the introduction of new probabilistic models able to learn the temporal and fine-grained latent dependencies across behaviors, modalities and interlocutors. In this talk, I will present some of our recent achievements in modeling multiple aspects of human communication dynamics, motivated by applications in healthcare (depression, PTSD, suicide, autism), education (learning analytics), business (negotiation, interpersonal skills) and social multimedia (opinion mining, social influence).

Institute for Creative Technologies, University of Southern California, USA, e-mail: morency@ict.usc.edu

Proceedings of 5th International Workshop on Spoken Dialog Systems, Napa, January 17-20, 2014
