Low-Level Cues and Ultra-Fast Face Detection

Recent experimental work has demonstrated extremely rapid saccades toward faces in natural scenes, which can be initiated only 100 ms after image onset (Crouzet et al., 2010). These ultra-rapid saccades pose a major challenge to current models of processing in the visual system because they do not seem to leave enough time for even a single feed-forward pass through the ventral stream. Here we explore the possibility that the information required to trigger these very fast saccades could be extracted very early in visual processing from relatively low-level amplitude spectrum (AS) information in the Fourier domain. Experiment 1 showed that AS normalization can significantly alter face-detection performance. However, a decrease in performance following AS normalization does not by itself prove that AS-based information is used (Gaspar and Rousselet, 2009). In Experiment 2, following Gaspar and Rousselet (2009), we used a swapping procedure to clarify the role of AS information in fast object detection. The experiment comprised three conditions: (i) original images; (ii) category swapped, in which the face image has the AS of a vehicle and the vehicle has the AS of a face; and (iii) identity swapped, in which the face has the AS of another face image and the vehicle has the AS of another vehicle image. Performance was very similar in the original and identity-swapped conditions and dropped clearly in the category-swapped condition. This result demonstrates that, in the early temporal window offered by the saccadic choice task, the visual saccadic system does indeed rely on low-level AS information to detect faces rapidly. This sort of crude diagnostic information could potentially be derived very early in the visual system, possibly as early as V1 and V2.
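
The AS manipulations described above can be illustrated with a minimal NumPy sketch. It assumes grayscale images of equal size stored as 2-D arrays. The function names (`swap_amplitude`, `normalize_amplitude`) are hypothetical and this is not the authors' stimulus-generation code; the actual stimuli may have involved further steps, such as luminance or contrast equalization, that are not shown here. Swapping keeps an image's own phase spectrum and replaces its amplitude spectrum with that of a donor image; normalization gives every image the average amplitude spectrum of the set.

```python
import numpy as np


def amplitude_and_phase(image):
    """Return the Fourier amplitude and phase spectra of a 2-D image."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def combine(amplitude, phase):
    """Rebuild an image from an amplitude spectrum and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    # For real input images the result is real up to rounding error,
    # so the tiny imaginary residue is discarded.
    return np.real(np.fft.ifft2(spectrum))


def swap_amplitude(image, donor):
    """Keep the phase of `image` but give it the amplitude spectrum of `donor`."""
    donor_amplitude, _ = amplitude_and_phase(donor)
    _, phase = amplitude_and_phase(image)
    return combine(donor_amplitude, phase)


def normalize_amplitude(images):
    """Give every image the mean amplitude spectrum of the whole set."""
    amplitudes, phases = zip(*(amplitude_and_phase(im) for im in images))
    mean_amplitude = np.mean(amplitudes, axis=0)
    return [combine(mean_amplitude, phase) for phase in phases]
```

Under these assumptions, the category-swapped condition corresponds to `swap_amplitude(face, vehicle)` and `swap_amplitude(vehicle, face)`, and the identity-swapped condition to `swap_amplitude(face, other_face)` and `swap_amplitude(vehicle, other_vehicle)`. The reconstructed pixel values can fall outside the original intensity range, so a real stimulus pipeline would need some rescaling or clipping step, which is left out here.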

[1] Doris Y. Tsao, et al. A Cortical Region Consisting Entirely of Face-Selective Cells, 2006, Science.

[2] H R Wilson, et al. Factors limiting peripheral pattern discrimination, 1999, Spatial vision.

[3] R. Watt, et al. Biological "bar codes" in human faces, 2009, Journal of vision.

[4] Matthias S. Keil, et al. “I Look in Your Eyes, Honey”: Internal Face Features Induce Spatial Frequency Preference for Human Face Processing, 2009, PLoS Comput. Biol.

[5] Keiji Tanaka, et al. Inferotemporal cortex and object vision, 1996, Annual review of neuroscience.

[6] R. VanRullen. On second glance: Still no high-level pop-out effect for faces, 2006, Vision Research.

[7] D H Brainard, et al. The Psychophysics Toolbox, 1997, Spatial vision.

[8] S. Hochstein, et al. With a careful look: Still no low-level confound to face pop-out, 2006, Vision Research.

[9] Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, International Journal of Computer Vision.

[10] M. Keil. Does face image statistics predict a preferred spatial frequency for human face processing?, 2008, Proceedings of the Royal Society B: Biological Sciences.

[11] Antonio Torralba, et al. Building the gist of a scene: the role of global image features in recognition, 2006, Progress in brain research.

[12] G. Rousselet, et al. How do amplitude spectra influence rapid animal detection?, 2009, Vision Research.

[13] Lester C. Loschky, et al. Localized information is necessary for scene categorization, including the Natural/Man-made distinction, 2008, Journal of vision.

[14] G Westheimer, et al. The Fourier Theory of Vision, 2001, Perception.

[15] Harvey A Swadlow, et al. The Impact of a Corticotectal Impulse on the Awake Superior Colliculus, 2006, The Journal of Neuroscience.

[16] D G Pelli, et al. The VideoToolbox software for visual psychophysics: transforming numbers into movies, 1997, Spatial vision.

[17] David C Lyon, et al. Distribution across cortical areas of neurons projecting to the superior colliculus in new world monkeys, 2005, The anatomical record. Part A, Discoveries in molecular, cellular, and evolutionary biology.

[18] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[19] J. Robson, et al. Application of Fourier analysis to the visibility of gratings, 1968, The Journal of physiology.

[20] L N Piotrowski, et al. A Demonstration of the Visual Importance and Flexibility of Spatial-Frequency Amplitude and Phase, 1982, Perception.

[21] D. Braun, et al. Phase noise and the classification of natural images, 2006, Vision Research.

[22] V. Lamme, et al. The distinct modes of vision offered by feedforward and recurrent processing, 2000, Trends in Neurosciences.

[23] B. Rossion, et al. ERP evidence for the speed of face categorization in the human brain: Disentangling the contribution of low-level visual cues from face perception, 2011, Vision Research.

[24] Steven C Dakin, et al. Positional averaging explains crowding with letter-like stimuli, 2009, Proceedings of the National Academy of Sciences.

[25] Arnaud Delorme, et al. Face identification using one spike per neuron: resistance to image degradations, 2001, Neural Networks.

[26] D. B. Bender, et al. Distribution of corticotectal cells in macaque, 2003, Experimental Brain Research.

[27] David J. Field, et al. Wavelets, vision and the statistics of natural scenes, 1999, Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences.

[28] Denis Fize, et al. Speed of processing in the human visual system, 1996, Nature.

[29] G. Kreiman, et al. Timing, Timing, Timing: Fast Decoding of Object Information from Intracranial Field Potentials in Human Visual Cortex, 2009, Neuron.

[30] Jan Drewes, et al. Animal detection in natural scenes: critical features revisited, 2010, Journal of vision.

[31] 安藤 広志, et al. Classic Books and Papers of the 20th Century: David Marr: Vision: a Computational Investigation into the Human Representation and Processing of Visual Information, 2005.

[32] Lester C. Loschky, et al. The importance of information localization in scene gist recognition, 2007, Journal of experimental psychology. Human perception and performance.

[33] Nathalie Guyader, et al. Image phase or amplitude? Rapid scene categorization is an amplitude-based process, 2004, Comptes rendus biologies.

[34] David Masip, et al. Preferred Spatial Frequencies for Human Face Processing Are Associated with Optimal Class Discrimination in the Machine, 2008, PLoS ONE.

[35] Simon J. Thorpe, et al. Ultra-rapid object detection with saccadic eye movements: Visual processing speed revisited, 2006, Vision Research.

[36] Sébastien M. Crouzet, et al. Fast saccades toward faces: face detection in just 100 ms, 2010, Journal of vision.

[37] Guillaume A. Rousselet, et al. Rapid visual categorization of natural scene contexts with equalized amplitude spectrum and increasing phase noise, 2009, Journal of vision.

[38] A.V. Oppenheim, et al. The importance of phase in signals, 1980, Proceedings of the IEEE.

[39] J. Smeets, et al. Nature of variability in saccades, 2003, Journal of neurophysiology.

[40] R. VanRullen, et al. Faces in the cloud: Fourier power spectrum biases ultrarapid face detection, 2008, Journal of vision.

[41] D G Pelli, et al. Uncertainty explains many aspects of visual contrast detection and discrimination, 1985, Journal of the Optical Society of America. A, Optics and image science.

[42] Michael J. Tarr, et al. Task-Specific Codes for Face Recognition: How they Shape the Neural Representation of Features for Detection and Individuation, 2008, PLoS ONE.

[43] Mark H. Johnson. Subcortical face processing, 2005, Nature Reviews Neuroscience.

[44] Timothée Masquelier, et al. Unsupervised Learning of Visual Features through Spike Timing Dependent Plasticity, 2007, PLoS Comput. Biol.

[45] D. Levi. Crowding—An essential bottleneck for object recognition: A mini-review, 2008, Vision Research.

[46] Antonio Torralba, et al. Statistics of natural image categories, 2003, Network.

[47] Stefan Treue, et al. Adaptation to statistical properties of visual scenes biases rapid categorization, 2007.

[48] R Van Rullen, et al. Face processing using one spike per neurone, 1998, Bio Systems.

[49] M. Potter. Meaning in visual search, 1975, Science.