EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos

Emotions play a key role in human communication and public presentations. Human emotions are usually expressed through multiple modalities. Therefore, exploring multimodal emotions and their coherence is of great value for understanding emotional expressions in presentations and improving presentation skills. However, manually watching and studying presentation videos is often tedious and time-consuming, and tool support for efficient, in-depth, multi-level analysis is lacking. Thus, in this paper, we introduce EmoCo, an interactive visual analytics system that facilitates efficient analysis of emotion coherence across facial, text, and audio modalities in presentation videos. Our visualization system features a channel coherence view and a sentence clustering view that together enable users to obtain a quick overview of emotion coherence and its temporal evolution. In addition, a detail view and a word view enable detailed exploration and comparison at the sentence level and word level, respectively. We evaluate the proposed system and visualization techniques through two usage scenarios based on TED Talk videos and interviews with two domain experts. The results demonstrate the effectiveness of our system in gaining insights into emotion coherence in presentations.
