Paper information - Deep convolutional neural networks in the face of caricature

Deep convolutional neural networks in the face of caricature

Real-world face recognition requires us to perceive the uniqueness of a face across variable images. Deep convolutional neural networks (DCNNs) accomplish this feat by generating robust face representations that can be analysed in a multidimensional ‘face space’. We examined the organization of viewpoint, illumination, gender and identity in this space. We found that DCNNs create a highly organized face similarity structure in which identities and images coexist. Natural image variation is organized hierarchically, with face identity nested under gender, and illumination and viewpoint nested under identity. To examine identity, we caricatured faces and found that identification accuracy increased with the strength of identity information in a face, and caricature representations ‘resembled’ their veridical counterparts, mimicking human perception. DCNNs therefore offer a theoretical framework for reconciling decades of behavioural and neural results that emphasized either the image or the face in representations, without understanding how a neural code could seamlessly accommodate both.

Human face recognition is robust to changes in viewpoint, illumination, facial expression and appearance. The authors investigated face recognition in deep convolutional neural networks by manipulating the strength of identity information in a face by caricaturing. They found that networks create a highly organized face similarity structure in which identities and images coexist.
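The idea of varying the "strength of identity information" by caricaturing can be illustrated with a small sketch. The abstract does not say how the caricatures were built, so the code below assumes the generic face-space notion of a caricature: a face's deviation from an average face is scaled linearly. The function names, the 16-dimensional toy space and the random data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def caricature(face, mean_face, strength):
    """Scale a face's deviation from the average face.

    strength = 1.0 returns the veridical face, values above 1.0 exaggerate
    it (caricature), and values below 1.0 pull it toward the average
    (anti-caricature). This linear rule is an assumption for illustration.
    """
    return mean_face + strength * (face - mean_face)

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "face space": 5 hypothetical identities described by 16 features each.
rng = np.random.default_rng(0)
faces = rng.normal(size=(5, 16))
mean_face = faces.mean(axis=0)

veridical = faces[0]
exaggerated = caricature(veridical, mean_face, strength=1.5)

# The caricature lies farther from the average face, along the same direction.
dist_veridical = np.linalg.norm(veridical - mean_face)
dist_exaggerated = np.linalg.norm(exaggerated - mean_face)
direction_match = cosine_similarity(veridical - mean_face, exaggerated - mean_face)
```

In this toy setting, `dist_exaggerated` is 1.5 times `dist_veridical` and `direction_match` is 1 up to floating-point error. The caricature is therefore more distinctive while still pointing toward the same identity, which is only a geometric analogy for the "resemblance" reported in the abstract.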
