Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms

Significance

This study measures face identification accuracy for an international group of professional forensic facial examiners working under circumstances that apply in real world casework. Examiners and other human face “specialists,” including forensically trained facial reviewers and untrained superrecognizers, were more accurate than the control groups on a challenging test of face identification. Therefore, specialists are the best available human solution to the problem of face identification. We present data comparing state-of-the-art face recognition technology with the best human face identifiers. The best machine performed in the range of the best humans: professional facial examiners. However, optimal face identification was achieved only when humans and machines worked in collaboration.

Abstract

Achieving the upper limits of face identification accuracy in forensic applications can minimize errors that have profound social and personal consequences. Although forensic examiners identify faces in these applications, systematic tests of their accuracy are rare. How can we achieve the most accurate face identification: using people and/or machines working alone or in collaboration? In a comprehensive comparison of face identification by humans and computers, we found that forensic facial examiners, facial reviewers, and superrecognizers were more accurate than fingerprint examiners and students on a challenging face identification test. Individual performance on the test varied widely. On the same test, four deep convolutional neural networks (DCNNs), developed between 2015 and 2017, identified faces within the range of human accuracy. Accuracy of the algorithms increased steadily over time, with the most recent DCNN scoring above the median of the forensic facial examiners. Using crowd-sourcing methods, we fused the judgments of multiple forensic facial examiners by averaging their rating-based identity judgments. Accuracy was substantially better for fused judgments than for individuals working alone. Fusion also served to stabilize performance, boosting the scores of lower-performing individuals and decreasing variability. Single forensic facial examiners fused with the best algorithm were more accurate than the combination of two examiners. Therefore, collaboration among humans and between humans and machines offers tangible benefits to face identification accuracy in important applications. These results offer an evidence-based roadmap for achieving the most accurate face identification possible.
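The rating-averaging fusion described above can be illustrated with a minimal sketch. It is not the study's analysis code. The rating scale (-3 to +3), the accuracy measure (area under the ROC curve), the function names `fuse_ratings` and `auc`, and the toy ratings are all assumptions chosen for illustration; none of them comes from the text above.

```python
def fuse_ratings(ratings_by_rater):
    """Average each image pair's ratings across raters (one list per rater)."""
    return [sum(col) / len(col) for col in zip(*ratings_by_rater)]


def auc(scores, same_identity):
    """Chance that a same-identity pair outscores a different-identity pair.

    Ties count as half a win. This is an assumed accuracy measure.
    """
    pos = [s for s, y in zip(scores, same_identity) if y]
    neg = [s for s, y in zip(scores, same_identity) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six image pairs, ratings on an assumed -3..+3 scale,
# where higher means "more likely the same person".
same_identity = [True, True, True, False, False, False]
rater_a = [3, 1, -1, -2, 0, -3]
rater_b = [2, -1, 2, 1, -2, -3]

fused = fuse_ratings([rater_a, rater_b])

print("rater A:", auc(rater_a, same_identity))
print("rater B:", auc(rater_b, same_identity))
print("fused:  ", auc(fused, same_identity))
```

In this toy case each rater makes one ranking error on a different image pair, so averaging cancels both errors and the fused ratings separate the two classes better than either rater alone. An algorithm's similarity scores could in principle be passed in as one more rater once mapped onto the same scale; how the study performed that mapping is not stated here.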
