Human detection using a mobile platform and novel features derived from a visual saliency mechanism

Human detection is a key ability for an increasing number of applications that operate in human-inhabited environments or need to interact with a human user. Currently, the most successful approaches to human detection are based on background subtraction techniques, which apply only to static cameras or cameras with highly constrained motions. Furthermore, many applications rely on features derived from specific human poses, such as systems based on features of the human face, which is visible only when a person is facing the camera. In this work, we present a new computer vision algorithm designed to operate with moving cameras and to detect humans in different poses, with the body partially or completely in view. We follow a standard pattern recognition approach based on four main steps: (i) preprocessing to achieve color constancy and stereo pair calibration, (ii) segmentation using depth continuity information, (iii) feature extraction based on visual saliency, and (iv) classification using a neural network. The main novelty of our approach lies in the feature extraction step, where we propose novel features derived from a visual saliency mechanism. In contrast to previous works, we do not use a pyramidal decomposition to run the saliency algorithm; instead, we run it at the original image resolution using the so-called integral image. Our results indicate that our method: (i) outperforms state-of-the-art techniques for human detection based on face detectors, (ii) outperforms state-of-the-art techniques for complete human body detection based on different sets of visual features, and (iii) operates in real time (15 fps) onboard a mobile platform, such as a mobile robot.
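To illustrate why an integral image makes it possible to skip the pyramidal decomposition, the following minimal sketch computes a center-surround contrast map directly at the original resolution. It is not the implementation used in this work: the function names, the square windows, and the radii `center_radius` and `surround_radius` are assumptions chosen only for illustration. The point it shows is that, once the summed-area table is built, the mean over a window of any size costs four lookups.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of the pixels in rows [top, bottom) and columns [left, right)."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def box_mean(sat, y, x, radius):
    """Mean intensity of a square window centred on (y, x), clipped to the image."""
    h, w = len(sat) - 1, len(sat[0]) - 1
    top, left = max(0, y - radius), max(0, x - radius)
    bottom, right = min(h, y + radius + 1), min(w, x + radius + 1)
    area = (bottom - top) * (right - left)
    return box_sum(sat, top, left, bottom, right) / area


def center_surround(img, center_radius=1, surround_radius=3):
    """Absolute difference between a small and a large window mean at every pixel.

    The radii are hypothetical values, not parameters taken from the paper.
    """
    sat = integral_image(img)
    h, w = len(img), len(img[0])
    return [
        [
            abs(box_mean(sat, y, x, center_radius) - box_mean(sat, y, x, surround_radius))
            for x in range(w)
        ]
        for y in range(h)
    ]


if __name__ == "__main__":
    # A single bright pixel on a dark background stands out in the contrast map.
    image = [[0] * 7 for _ in range(7)]
    image[3][3] = 1
    contrast = center_surround(image)
    print(round(contrast[3][3], 4))
```

Because the cost per window does not depend on its size, several center and surround scales can be evaluated on the full-resolution image without resampling it.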
