Cross-modality based celebrity face naming for news image collections

To automatically mine the underlying relationships between famous persons in daily news (for example, to build a news-person network that uses faces as icons to facilitate face-based person finding), we need a tool that automatically labels faces in new images with their real names. This paper studies the problem of linking names with faces in large-scale collections of captioned news images. In our previous work, we proposed a method called Person-based Subset Clustering, which relies mainly on clustering all face images associated with the same name. The location where a name appears in a caption, as well as the visual structural information within a news image, provides informative cues about who actually appears in the associated image. By combining this domain knowledge from captions and their corresponding images, we propose a novel cross-modality approach that further improves the performance of linking names with faces. Experiments are performed on data sets comprising approximately half a million news images from Yahoo! News, and the results show that the proposed method achieves significant improvement over clustering-only methods.
