Naming faces in broadcast news video by image google

Naming faces is important for browsing and indexing news video. Previous work either relies only on the co-occurrence of faces and names, or uses a few cues as features for a simple heuristic or machine learning method; in both cases, little external knowledge about the names and faces is used. In this paper we present a novel approach that names faces by exploiting external knowledge obtained from Google image search. The underlying assumption is that the faces of important persons appear many times in web images and can be retrieved easily through Google image search. First, faces are detected in the video frames, and the named entities of candidate persons are extracted from the textual information produced by automatic speech recognition and closed caption detection. Then, these candidate person names are used as queries to retrieve name-related person images through Google image search. Next, the retrieved results are analyzed and typical faces are selected through feature density estimation. Finally, the faces detected in the news video are matched against the typical faces selected from the retrieved results to label each face. Experimental results on MSNBC and CNN news demonstrate that the proposed approach is effective.
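
The last two steps of the pipeline (selecting typical faces by feature density, then matching video faces against them) can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the features, the density estimator, or the matching rule. The sketch assumes faces are already reduced to fixed-length feature vectors, uses a Gaussian kernel density score, and labels a video face by its nearest typical face under a distance threshold. All names (`density_scores`, `select_typical`, `name_face`, `bandwidth`, `max_dist`) and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def density_scores(features: List[Vector], bandwidth: float) -> List[float]:
    """Unnormalized Gaussian kernel density evaluated at each feature vector."""
    scale = 2.0 * bandwidth ** 2
    return [
        sum(math.exp(-sq_dist(f, g) / scale) for g in features)
        for f in features
    ]


def select_typical(features: List[Vector], bandwidth: float, k: int) -> List[Vector]:
    """Keep the k faces lying in the densest region of feature space.

    Assumption: faces of the queried person dominate the retrieved images,
    so outliers (other people, false detections) get low density scores.
    """
    scores = density_scores(features, bandwidth)
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return [features[i] for i in ranked[:k]]


def name_face(
    query: Vector, gallery: Dict[str, List[Vector]], max_dist: float
) -> Optional[str]:
    """Label a video face with the name of its nearest typical face.

    Returns None when no typical face is closer than max_dist.
    """
    best_name: Optional[str] = None
    best = max_dist ** 2
    for name, exemplars in gallery.items():
        for exemplar in exemplars:
            d = sq_dist(query, exemplar)
            if d < best:
                best_name, best = name, d
    return best_name


# Toy 2-D "features" standing in for faces retrieved per candidate name;
# the last vector in each list plays the role of an unrelated image.
web_results = {
    "Alice": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)],
    "Bob": [(3.0, 3.0), (3.1, 3.0), (2.9, 3.1), (-4.0, 6.0)],
}
gallery = {
    name: select_typical(feats, bandwidth=0.5, k=2)
    for name, feats in web_results.items()
}
label_a = name_face((0.05, 0.05), gallery, max_dist=1.0)
label_b = name_face((3.0, 3.05), gallery, max_dist=1.0)
label_unknown = name_face((10.0, 10.0), gallery, max_dist=1.0)
```

In this toy setup the outlier vectors are excluded from each gallery, and a query far from every typical face stays unlabeled rather than being forced onto a candidate name. A real system would replace the 2-D tuples with face descriptors and tune `bandwidth`, `k`, and `max_dist` on held-out data.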
