Experiments on Visual Information Extraction with the Faces of Wikipedia

We present a series of visual information extraction experiments using the Faces of Wikipedia database - a new resource that we release into the public domain for both recognition and extraction research containing over 50,000 identities and 60,000 disambiguated images of faces. We compare different techniques for automatically extracting the faces corresponding to the subject of a Wikipedia biography within the images appearing on the page. Our top performing approach is based on probabilistic graphical models and uses the text of Wikipedia pages, similarities of faces as well as various other features of the document, meta-data and image files. Our method resolves the problem jointly for all detected faces on a page. While our experiments focus on extracting faces from Wikipedia biographies, our approach is easily adapted to other types of documents and multiple documents. We focus on Wikipedia because the content is a Creative Commons resource and we provide our database to the community including registered faces, hand labeled and automated disambiguations, processed captions, meta data and evaluation protocols. Our best probabilistic extraction pipeline yields an expected average accuracy of 77% compared to image only and text only baselines which yield 63% and 66% respectively.

[1]  Dragomir Anguelov,et al.  Contextual Identity Recognition in Personal Photo Albums , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[3]  Li Bai,et al.  Cosine Similarity Metric Learning for Face Verification , 2010, ACCV.

[4]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[5]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[6]  Shree K. Nayar,et al.  FaceTracer: A Search Engine for Large Collections of Images with Faces , 2008, ECCV.

[7]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Yuxiao Hu,et al.  Efficient propagation for face annotation in family albums , 2004, MULTIMEDIA '04.

[9]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[10]  Tamara L. Berg,et al.  names and faces. , 1982, The Physician and sportsmedicine.

[11]  Cordelia Schmid,et al.  Face recognition from caption-based supervision , 2010 .

[12]  Tal Hassner,et al.  Face recognition in unconstrained videos with matched background similarity , 2011, CVPR 2011.

[13]  Pietro Perona,et al.  Pruning training sets for learning of object categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Christopher Joseph Pal,et al.  Improving alignment of faces for recognition , 2011, 2011 IEEE International Symposium on Robotic and Sensors Environments (ROSE).

[15]  Shree K. Nayar,et al.  Attribute and simile classifiers for face verification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[16]  Yee Whye Teh,et al.  Names and faces in the news , 2004, CVPR 2004.

[17]  Aniket Kittur,et al.  What's in Wikipedia?: mapping topics and conflict using socially annotated category structure , 2009, CHI.

[18]  Yongkang Wong,et al.  Patch-based probabilistic image quality assessment for face selection and improved video-based face recognition , 2011, CVPR 2011 WORKSHOPS.

[19]  Matti Pietikäinen,et al.  A Generalized Local Binary Pattern Operator for Multiresolution Gray Scale and Rotation Invariant Texture Classification , 2001, ICAPR.