Annotating Unconstrained Face Imagery: A scalable approach

As unconstrained face recognition datasets progress from faces that commodity face detectors can find automatically to face imagery with full pose variation that must be localized manually, developing benchmark datasets requires a significant amount of annotation effort. In this work we describe a systematic approach for annotating fully unconstrained face imagery using crowdsourced labor. For such data preparation, a cascade of crowdsourced tasks is performed: first, bounding box annotation of all faces contained in images and videos; next, identification of the labelled person of interest in that imagery; and, finally, landmark annotation of key facial fiducial points. To allow such annotations to scale to large volumes of imagery, a software system architecture is provided that achieves a sustained rate of 30,000 manual annotations per hour (500 per minute). While previous crowdsourcing guidance in the literature generally involved multiple-choice questions or text input, our tasks required annotators to provide geometric primitives (rectangles and points) in images. Accordingly, algorithms are provided for combining multiple annotations of an image into a single result and for automatically measuring the quality of a given annotation. Finally, further guidance is provided for improving the accuracy and scalability of crowdsourced image annotation for face detection and recognition.
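To illustrate what combining several rectangle annotations and scoring each one can look like, the following is a minimal sketch only. The abstract does not specify the algorithms used in this work, so the coordinate-wise median and the intersection-over-union (IoU) score below are assumptions chosen for illustration, and the names `consolidate_boxes`, `iou`, and `annotation_quality` are hypothetical.

```python
from statistics import median
from typing import List, Tuple

# A rectangle as (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def consolidate_boxes(boxes: List[Box]) -> Box:
    """Combine several annotators' boxes for one face into a single box.

    Assumption: the coordinate-wise median is used because it is robust
    to a minority of badly placed boxes. This is not taken from the paper.
    """
    if not boxes:
        raise ValueError("at least one box is required")
    return tuple(median(b[i] for b in boxes) for i in range(4))


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned rectangles."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def annotation_quality(boxes: List[Box]) -> List[float]:
    """Score each annotation by its overlap with the consolidated box.

    Assumption: agreement with the consensus stands in for quality.
    """
    consensus = consolidate_boxes(boxes)
    return [iou(b, consensus) for b in boxes]


if __name__ == "__main__":
    # Two annotators roughly agree; the third marked a different region.
    boxes = [(10, 10, 50, 50), (12, 11, 52, 49), (100, 100, 140, 140)]
    print(consolidate_boxes(boxes))
    print(annotation_quality(boxes))
```

In this sketch an annotation that does not overlap the consensus box at all scores zero and could be flagged for review or re-annotation. Point annotations such as landmarks could be handled analogously, with a per-coordinate median and a distance-based score.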
