Dynamic character graph via online face clustering for movie analysis

An effective approach to automated movie content analysis is to build a network (graph) of the movie's characters. Existing work usually builds a static character graph that summarizes the content using metadata, scripts, or manual annotations. We propose an unsupervised approach to building a dynamic character graph that captures the temporal evolution of character interactions. We refer to this as the character interaction graph (CIG). Our approach has two components: (i) an online face clustering algorithm that discovers the characters in the video stream as they appear, and (ii) simultaneous creation of a CIG using the temporal dynamics of the resulting clusters. We demonstrate the usefulness of the CIG for two movie analysis tasks: segmentation of the narrative structure into acts, and retrieval of major characters. Our evaluation on full-length movies containing more than 5000 face tracks shows that the proposed approach achieves superior performance on both tasks.
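The two components described above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes each face track is already reduced to a fixed-length feature vector, swaps in a simple nearest-centroid rule with a distance threshold for the online clustering step, and counts an interaction whenever two different clusters appear within a short time window. The names `OnlineFaceClusterer`, `build_dynamic_graph`, `threshold`, and `window` are hypothetical.

```python
import math
from collections import defaultdict


def _dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OnlineFaceClusterer:
    """Assumed stand-in for online clustering: nearest centroid with a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.centroids = []  # one running-mean feature vector per cluster
        self.counts = []     # number of tracks assigned to each cluster

    def assign(self, feature):
        # Find the closest existing cluster, if any.
        best, best_d = None, float("inf")
        for i, c in enumerate(self.centroids):
            d = _dist(feature, c)
            if d < best_d:
                best, best_d = i, d
        # No cluster is close enough: treat the track as a new character.
        if best is None or best_d > self.threshold:
            self.centroids.append(list(feature))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Otherwise update the running mean of the matched cluster.
        n = self.counts[best] + 1
        self.centroids[best] = [
            c + (x - c) / n for c, x in zip(self.centroids[best], feature)
        ]
        self.counts[best] = n
        return best


def build_dynamic_graph(tracks, threshold, window):
    """Process (time, feature) pairs in temporal order.

    Returns one (time, edge_weights) snapshot per track, so the graph can be
    inspected at any point in the stream. Edge keys are (low_id, high_id).
    """
    clusterer = OnlineFaceClusterer(threshold)
    recent = []               # (time, cluster_id) pairs inside the window
    edges = defaultdict(int)  # cumulative co-occurrence counts
    snapshots = []
    for t, feature in tracks:
        cid = clusterer.assign(feature)
        # Keep only tracks that are still inside the time window.
        recent = [(rt, rc) for rt, rc in recent if t - rt <= window]
        # Each distinct other character seen recently gains one interaction.
        for other in {rc for _, rc in recent if rc != cid}:
            edges[(min(cid, other), max(cid, other))] += 1
        recent.append((t, cid))
        snapshots.append((t, dict(edges)))
    return snapshots


# Toy stream: 2-D features stand in for real face descriptors.
tracks = [
    (0.0, [0.0, 0.0]),
    (1.0, [5.0, 5.0]),
    (2.0, [0.1, 0.0]),
    (50.0, [9.0, 0.0]),
    (51.0, [5.0, 5.1]),
]
snapshots = build_dynamic_graph(tracks, threshold=1.0, window=5.0)
final_edges = snapshots[-1][1]
```

Keeping a snapshot per track is what makes the graph dynamic in this sketch: comparing snapshots across time shows when edges appear and strengthen, which is the kind of temporal signal that tasks such as act segmentation could draw on.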
