Character Identification in Feature-Length Films Using Global Face-Name Matching

Identifying characters in films, although intuitive to humans, still poses a significant challenge to computational methods. In this paper, we investigate the problem of identifying characters in feature-length films using the video and the film script. Most state-of-the-art methods for naming faces in videos rely on local matching between a visible face and one of the names extracted from the temporally local video transcript. In contrast, we attempt a global matching between names and clustered face tracks, targeting the case in which not enough local name cues can be found. The contributions of our work include: 1) A graph matching method is used to build the face-name association between a face affinity network and a name affinity network, which are derived from their respective domains (video and script). 2) An effective measure of face track distance is presented for face track clustering. 3) As an application, the relationships between characters are mined using social network analysis. The proposed framework enables a new experience of character-centered film browsing. Experiments are conducted on ten feature-length films and give encouraging results.
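
To make the idea of global face-name matching concrete, the following is a minimal sketch, not the graph matching method of this paper. It assumes two small symmetric affinity matrices of equal size, one over face clusters and one over names, and finds the one-to-one assignment whose permuted name affinities best agree with the face affinities by exhaustive search. The function name `match_names_to_faces`, the character names, and all affinity values are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def match_names_to_faces(face_aff, name_aff):
    """Exhaustively search for the assignment of names to face clusters.

    face_aff[i][j]: affinity between face clusters i and j (assumed input).
    name_aff[a][b]: affinity between names a and b (assumed input).
    Returns (perm, cost), where perm[i] is the index of the name assigned
    to face cluster i and cost is the summed squared affinity mismatch.
    """
    n = len(face_aff)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        # Compare each face-face affinity with the affinity between
        # the two names assigned to those face clusters.
        cost = sum(
            (face_aff[i][j] - name_aff[perm[i]][perm[j]]) ** 2
            for i in range(n)
            for j in range(n)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Hypothetical name affinity network (e.g. co-occurrence in script scenes).
names = ["Alice", "Bob", "Carol"]
name_aff = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]

# Hypothetical face affinity network with the same structure, but with the
# clusters listed in a different, unknown order.
face_aff = [
    [0.0, 0.1, 0.3],
    [0.1, 0.0, 0.8],
    [0.3, 0.8, 0.0],
]

perm, cost = match_names_to_faces(face_aff, name_aff)
labels = [names[k] for k in perm]
print(labels, cost)
```

Exhaustive search is factorial in the number of characters, so it is only workable for a handful of clusters. It is used here solely because it makes the objective explicit: no face is named from a local cue, and the assignment follows entirely from the agreement between the two networks.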
