Automatic Role Recognition in Multiparty Recordings: Using Social Affiliation Networks for Feature Extraction

Automatic analysis of social interactions attracts increasing attention in the multimedia community. This letter considers one of the most important aspects of the problem, namely the roles played by individuals interacting in different settings. In particular, this work proposes an automatic approach for the recognition of roles in both production environments (e.g., news and talk shows) and spontaneous situations (e.g., meetings). The experiments are performed over roughly 90 h of material (one of the largest databases used for role recognition in the literature) and show that recognition effectiveness depends on how much the roles influence the behavior of people. Furthermore, this work proposes the first approach for modeling mutual dependencies between roles and assesses its effect on role recognition performance.
