Quantifying the Extent to Which Race and Gender Features Determine Identity in Commercial Face Recognition Algorithms

Human face features can be used to determine individual identity as well as demographic information like gender and race. However, the extent to which black-box commercial face recognition algorithms (CFRAs) use gender and race features to determine identity is poorly understood despite increasing deployments by government and industry. In this study, we quantified the degree to which gender and race features influenced face recognition similarity scores between different people, i.e., non-mated scores. We ran this study using five different CFRAs and a sample of 333 diverse test subjects. As a control, we compared the behavior of these non-mated score distributions to those produced by a commercial iris recognition algorithm (CIRA). Confirming prior work, all CFRAs produced higher similarity scores for people of the same gender and race, an effect known as "broad homogeneity". No such effect was observed for the CIRA. Next, we applied principal component analysis (PCA) to the similarity score matrices. We show that some principal components (PCs) of CFRAs cluster people by gender and race, but the majority do not. Demographic clustering in the PCs accounted for only 10% of the total CFRA score variance. No clustering was observed for the CIRA. This demonstrates that, although CFRAs use some gender and race features to establish identity, most features utilized by current CFRAs are unrelated to gender and race, similar to the iris texture patterns utilized by the CIRA. Finally, reconstruction of similarity score matrices using only PCs that showed no demographic clustering reduced broad homogeneity effects, but also decreased the separation between mated and non-mated scores. This suggests that it is possible for CFRAs to operate on features unrelated to gender and race, albeit with somewhat lower recognition accuracy, but that this is not the current commercial practice.
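The PCA and reconstruction steps described above can be illustrated with a minimal sketch on synthetic data. This is not the authors' implementation: the toy feature model, the helper names (`eta_squared`, `homogeneity_gap`), the use of eta-squared as a clustering measure, and the 0.5 threshold are all assumptions made for illustration, and the toy data are not calibrated to reproduce the 10% figure reported in the abstract.

```python
import numpy as np


def eta_squared(values, labels):
    """Share of the variance in `values` explained by group membership."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    grand = values.mean()
    total = np.sum((values - grand) ** 2)
    if total == 0:
        return 0.0
    between = sum(
        np.sum(labels == g) * (values[labels == g].mean() - grand) ** 2
        for g in np.unique(labels)
    )
    return float(between / total)


def homogeneity_gap(S, labels):
    """Mean same-group score minus mean different-group score (diagonal excluded)."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    return float(S[same & off_diag].mean() - S[~same].mean())


# Toy data (assumption): one shared "demographic" axis plus 20 identity axes.
rng = np.random.default_rng(0)
n, d = 60, 20
labels = np.repeat([0, 1], n // 2)
group_axis = np.where(labels == 0, 1.0, -1.0)
features = np.column_stack([3.0 * group_axis, rng.normal(size=(n, d))])
S = features @ features.T / features.shape[1]  # subject-by-subject similarity

# PCA via SVD of the column-centered similarity matrix.
col_mean = S.mean(axis=0, keepdims=True)
U, s, Vt = np.linalg.svd(S - col_mean, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Flag PCs whose subject projections cluster by group (hypothetical criterion).
significant = explained > 1e-12  # ignore numerically empty components
eta = np.array([eta_squared(U[:, k] * s[k], labels) for k in range(len(s))])
demographic = significant & (eta > 0.5)
keep = significant & ~demographic

# Reconstruct the matrix from all PCs and from non-demographic PCs only.
S_full = col_mean + (U * s) @ Vt
S_reduced = col_mean + (U[:, keep] * s[keep]) @ Vt[keep, :]

gap_before = homogeneity_gap(S, labels)
gap_after = homogeneity_gap(S_reduced, labels)
demographic_share = float(explained[demographic].sum())

print("PCs flagged as demographic:", int(demographic.sum()))
print("variance share of flagged PCs:", round(demographic_share, 3))
print("homogeneity gap before / after:", round(gap_before, 3), round(gap_after, 3))
```

In this sketch the diagonal of `S` is self-similarity rather than a true mated comparison, so it only illustrates the broad homogeneity reduction; assessing the loss of separation between mated and non-mated scores would additionally require repeated samples per subject.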
