A Sequential Data Analysis Approach to Detect Emergent Leaders in Small Groups

This paper addresses the problem of predicting emergent leaders (ELs) in small groups, that is, in meetings. This is a long-standing research problem in social and organizational psychology and one that has recently gained momentum in social computing. To this end, we propose a novel method that analyzes the temporal dependencies of audio-visual data by applying unsupervised deep generative models (feature learning). To the best of our knowledge, this is the first attempt to apply sequential data processing to EL detection. Feature learning yields a single feature vector per time interval, and all feature vectors representing a participant are aggregated using novel fusion techniques. Finally, EL detection is performed using state-of-the-art single and multiple kernel learning algorithms. The proposed method shows (significantly) improved results compared to state-of-the-art methods and, being a general approach, can be adapted to analyze various small group interactions.
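
The data flow described above (one feature vector per time interval, aggregation per participant, then kernel-based classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the learned features are taken as given, mean pooling stands in for the fusion techniques, and a fixed-weight sum of RBF kernels stands in for multiple kernel learning, where the weights would normally be learned. The names `mean_pool`, `rbf_kernel`, and `combined_kernel` are hypothetical and do not come from the paper.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def mean_pool(interval_features: Sequence[Vector]) -> Vector:
    """Aggregate per-interval feature vectors into one vector per participant.

    Mean pooling is an assumed stand-in, not the paper's fusion technique.
    """
    n = len(interval_features)
    dim = len(interval_features[0])
    return [sum(v[d] for v in interval_features) / n for d in range(dim)]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two aggregated feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def combined_kernel(
    x: Dict[str, Vector], y: Dict[str, Vector], weights: Dict[str, float]
) -> float:
    """Fixed-weight sum of one RBF kernel per modality.

    A real multiple kernel learning algorithm would learn these weights;
    here they are supplied by hand (assumption).
    """
    return sum(w * rbf_kernel(x[m], y[m]) for m, w in weights.items())


# Toy per-interval features for two participants (made-up numbers).
participant_a = {
    "audio": mean_pool([[0.0, 1.0], [2.0, 3.0]]),
    "visual": mean_pool([[1.0], [1.0]]),
}
participant_b = {
    "audio": mean_pool([[1.0, 1.0], [1.0, 1.0]]),
    "visual": mean_pool([[0.0], [2.0]]),
}
weights = {"audio": 0.5, "visual": 0.5}

k_ab = combined_kernel(participant_a, participant_b, weights)
print(f"combined kernel value between A and B: {k_ab:.4f}")
```

The resulting kernel values could be assembled into a Gram matrix and passed to any kernel classifier that accepts precomputed kernels; that step is omitted here to keep the sketch dependency-free.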
