Biologically-Inspired Face Detection: Non-Brute-Force-Search Approach

We present a biologically-inspired face detection system. The system applies notions such as saliency, gist, and gaze to localize a face without performing a blind spatial search. The saliency model consists of highly parallel low-level computations that operate in domains such as intensity, orientation, and color. Its output directs attention to a set of conspicuous image locations, which serve as starting points for the search. The gist model, computed in parallel with the saliency model, estimates holistic image characteristics such as dominant contours and magnitude in high and low spatial frequency bands. We currently limit its use to predicting the likely head size from the entire scene. Finally, instead of identifying a face as a single entity, the system performs detection by parts and uses spatial configuration constraints to remain robust against occlusion and changes in perspective.
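
To make the idea of saliency-driven starting points concrete, the following is a minimal sketch and not the system described above. It assumes a single intensity channel, approximates center-surround contrast with two box filters of different size, and picks conspicuous locations by repeatedly taking the maximum and suppressing its neighborhood. The function names (`box_blur`, `intensity_saliency`, `attended_locations`), the radii, and the suppression step are all illustrative assumptions.

```python
import numpy as np


def box_blur(img, radius):
    """Mean filter over a (2*radius+1)-wide square window, edge-padded."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
             - ii[k:k + h, :w] + ii[:h, :w])
    return total / (k * k)


def intensity_saliency(img, center_radius=1, surround_radius=4):
    """Center-surround contrast on intensity only (assumed stand-in
    for a multi-channel, multi-scale saliency model)."""
    center = box_blur(img, center_radius)
    surround = box_blur(img, surround_radius)
    sal = np.abs(center - surround)
    peak = sal.max()
    return sal / peak if peak > 0 else sal


def attended_locations(sal, n=3, inhibit_radius=5):
    """Return up to n (row, col) peaks, zeroing a square neighborhood
    around each pick so the next pick lands elsewhere (an assumption)."""
    sal = sal.copy()
    locs = []
    for _ in range(n):
        r, c = np.unravel_index(np.argmax(sal), sal.shape)
        locs.append((int(r), int(c)))
        r0, c0 = max(0, r - inhibit_radius), max(0, c - inhibit_radius)
        sal[r0:r + inhibit_radius + 1, c0:c + inhibit_radius + 1] = 0.0
    return locs


if __name__ == "__main__":
    # Synthetic image: one bright 3x3 blob on a dark background.
    img = np.zeros((32, 32))
    img[10:13, 20:23] = 1.0
    print(attended_locations(intensity_saliency(img), n=2))
```

In a pipeline of the kind outlined above, locations like these would be handed to a part detector as candidate starting points instead of scanning every window; the orientation and color channels, the gist-based head-size estimate, and the spatial configuration constraints are omitted here.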
