People detection and tracking using stereo vision and color

People detection and tracking are important capabilities for applications that aim to achieve natural human-machine interaction. Although the topic has been extensively explored using a single camera, the availability and low price of new commercial stereo cameras make them an attractive sensor for developing more sophisticated applications that take advantage of depth information. This work presents a system able to visually detect and track multiple people using a stereo camera placed at an under-head position. This camera position is especially appropriate for human-machine applications that require interacting with people or analyzing human facial gestures. The system models the background as a height map, which is used to extract foreground objects; people are then found among these objects using a face detector. Once a person has been spotted, the system tracks that person while it continues looking for more people. Our system tracks people by combining color information with position information (the latter using the Kalman filter). Tracking based exclusively on position information is unreliable when people interact at close range. Thus, we also include color information about people's clothes in order to increase tracking robustness. The system has been extensively tested, and the results show that the use of color greatly reduces the errors of the tracking system. In addition, the people detection technique employed, which combines plan-view map information with a face detector, avoided false detections in the tests performed. Finally, the low computing time required for detection and tracking makes the system suitable for real-time applications.
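
The idea of combining position and color when assigning detections to tracked people can be illustrated with a minimal Python sketch. It is not the system's actual implementation. It assumes that each track already has a predicted plan-view position (for instance from a Kalman filter, not implemented here) and a clothing color histogram. The Bhattacharyya coefficient as the color similarity, the greedy assignment, and the names `match_cost`, `associate`, `gate` and `w_color` are all assumptions made for illustration.

```python
import math

def bhattacharyya_coefficient(h1, h2):
    """Similarity in [0, 1] between two histograms (normalized internally)."""
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0
    return sum(math.sqrt((a / s1) * (b / s2)) for a, b in zip(h1, h2))

def match_cost(pred_xy, track_hist, det_xy, det_hist, gate=1.0, w_color=0.5):
    """Combined cost of pairing a track with a detection; lower is better.

    gate (hypothetical, in meters) caps the position term at 1;
    w_color (hypothetical) weights color against position.
    """
    dist = math.hypot(pred_xy[0] - det_xy[0], pred_xy[1] - det_xy[1])
    pos_cost = min(dist / gate, 1.0)
    color_cost = 1.0 - bhattacharyya_coefficient(track_hist, det_hist)
    return (1.0 - w_color) * pos_cost + w_color * color_cost

def associate(tracks, detections, gate=1.0, w_color=0.5):
    """Greedy one-to-one assignment: returns {track index: detection index}."""
    pairs = sorted(
        (match_cost(t["xy"], t["hist"], d["xy"], d["hist"], gate, w_color), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    assignment, used = {}, set()
    for _, ti, di in pairs:
        if ti not in assignment and di not in used:
            assignment[ti] = di
            used.add(di)
    return assignment

# Two people crossing at close range: predicted positions alone are ambiguous.
tracks = [
    {"xy": (0.0, 0.0), "hist": [8, 1, 1]},   # person in mostly red clothes
    {"xy": (0.4, 0.0), "hist": [1, 1, 8]},   # person in mostly blue clothes
]
detections = [
    {"xy": (0.15, 0.0), "hist": [1, 1, 8]},  # blue clothes, nearer to track 0
    {"xy": (0.25, 0.0), "hist": [8, 1, 1]},  # red clothes, nearer to track 1
]

position_only = associate(tracks, detections, w_color=0.0)  # identities swapped
with_color = associate(tracks, detections, w_color=0.5)     # identities kept
```

In this toy case the position-only assignment pairs each track with the nearest detection and so swaps the two identities, while the color term restores the correct pairing. A globally optimal assignment (for example with the Hungarian method) could replace the greedy loop.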
