Explaining face representation in the primate brain using different computational models

Understanding how the brain represents the identity of complex objects is a central challenge of visual neuroscience. The principles governing object processing have been extensively studied in the macaque face patch system, a sub-network of inferotemporal (IT) cortex specialized for face processing. A previous study reported that single face patch neurons encode axes of a generative model called the "active appearance" model, which transforms 50D feature vectors separately representing facial shape and facial texture into facial images. However, a systematic comparison of this model with other computational models, especially convolutional neural network models that have shown success in explaining neural responses in the ventral visual stream, has been lacking. Here, we recorded responses of cells in the most anterior face patch, anterior medial (AM), to a large set of real face images and compared how well a large number of models explained these responses. We found that the active appearance model explained responses better than any other model except CORnet-Z, a feedforward deep neural network trained on general object classification of non-face images, whose performance it tied on some face image sets and exceeded on others. Surprisingly, deep neural networks trained specifically on facial identification did not explain neural responses well. A major reason is that units in these networks are less modulated than neurons by face-related factors unrelated to facial identification, such as illumination.
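
As an illustration of what comparing models on their ability to explain neural responses can involve, the following is a minimal sketch and not the procedure used in the study. It assumes each model yields one feature vector per image, reduces those features to a fixed number of principal components (50 is chosen here only to mirror the 50D feature vectors mentioned above), fits a linear map to the recorded responses, and scores the model by cross-validated explained variance. The function names, the fold count, and the synthetic data are all hypothetical.

```python
import numpy as np


def cv_explained_variance(features, responses, n_components=50, n_folds=5, seed=0):
    """Cross-validated explained variance of a linear encoding model, per cell.

    features:  (n_images, n_features) model features (hypothetical input).
    responses: (n_images, n_cells) neural responses (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    n_images = features.shape[0]
    folds = np.array_split(rng.permutation(n_images), n_folds)
    predicted = np.zeros_like(responses, dtype=float)

    for test_idx in folds:
        train_mask = np.ones(n_images, dtype=bool)
        train_mask[test_idx] = False
        x_train, x_test = features[train_mask], features[test_idx]
        y_train = responses[train_mask]

        # PCA fitted on the training images only, to avoid leaking test data.
        x_mean = x_train.mean(axis=0)
        _, _, vt = np.linalg.svd(x_train - x_mean, full_matrices=False)
        basis = vt[:n_components].T
        z_train = (x_train - x_mean) @ basis
        z_test = (x_test - x_mean) @ basis

        # Ordinary least squares with an intercept column.
        design_train = np.column_stack([z_train, np.ones(len(z_train))])
        design_test = np.column_stack([z_test, np.ones(len(z_test))])
        weights, *_ = np.linalg.lstsq(design_train, y_train, rcond=None)
        predicted[test_idx] = design_test @ weights

    # Explained variance on held-out predictions, one value per cell.
    ss_res = ((responses - predicted) ** 2).sum(axis=0)
    ss_tot = ((responses - responses.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


def demo():
    """Compare two synthetic 'models' on synthetic 'neural' data."""
    rng = np.random.default_rng(1)
    n_images, n_dims, n_cells = 400, 50, 10
    latent = rng.standard_normal((n_images, n_dims))
    # Synthetic cells respond linearly to the latent features, plus noise.
    responses = latent @ rng.standard_normal((n_dims, n_cells))
    responses += 0.5 * rng.standard_normal((n_images, n_cells))

    model_a = latent                                     # features matching the latent code
    model_b = rng.standard_normal((n_images, n_dims))    # unrelated features
    return (cv_explained_variance(model_a, responses),
            cv_explained_variance(model_b, responses))


if __name__ == "__main__":
    ev_a, ev_b = demo()
    print("model A mean explained variance:", ev_a.mean())
    print("model B mean explained variance:", ev_b.mean())
```

Fitting the dimensionality reduction and the regression on training images only, and scoring on held-out images, keeps models with many features from being favored simply because they can overfit.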
