Multi- and Single View Multiperson Tracking for Smart Room Environments

Simultaneous tracking of multiple persons in real-world environments is an active research field, and several approaches have been proposed, based on a variety of features and algorithms. In this work, we present two multimodal systems for tracking multiple users in a smart room environment. One is a multi-view tracker based on color histogram tracking and special person region detectors. The other is a wide-angle overhead view person tracker relying on foreground segmentation and model-based tracking. Both systems are completed by a joint probabilistic data association filter-based source localization framework using input from several microphone arrays. We also briefly present two intuitive metrics to allow for objective comparison of tracker characteristics, focusing on their precision in estimating object locations, their accuracy in recognizing object configurations, and their ability to consistently label objects over time. The trackers are extensively tested and compared, for each modality separately and for the combined modalities, on the CLEAR 2006 Evaluation Database.
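
The abstract does not give formulas for the two metrics. As a rough illustration only, the sketch below assumes they resemble the commonly used multiple object tracking precision (mean localization error over matched object-hypothesis pairs) and accuracy (one minus the ratio of misses, false positives, and label mismatches to ground-truth objects). The `FrameResult` record and both function names are hypothetical, and the per-frame matching between objects and hypotheses is assumed to have been done already.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameResult:
    """Per-frame tallies after objects and hypotheses were matched (hypothetical record)."""
    match_distances: List[float]  # localization error of each matched pair
    misses: int                   # ground-truth objects with no hypothesis
    false_positives: int          # hypotheses with no ground-truth object
    mismatches: int               # identity label switches in this frame
    num_objects: int              # ground-truth objects present in this frame


def tracking_precision(frames: List[FrameResult]) -> float:
    """Mean localization error over all matched pairs (lower is better)."""
    total_distance = sum(sum(f.match_distances) for f in frames)
    total_matches = sum(len(f.match_distances) for f in frames)
    return total_distance / total_matches if total_matches else float("nan")


def tracking_accuracy(frames: List[FrameResult]) -> float:
    """One minus the total error ratio over all frames (higher is better)."""
    errors = sum(f.misses + f.false_positives + f.mismatches for f in frames)
    objects = sum(f.num_objects for f in frames)
    return 1.0 - errors / objects if objects else float("nan")


# Two illustrative frames with made-up numbers (distances in millimetres).
frames = [
    FrameResult([100.0, 200.0], misses=0, false_positives=0, mismatches=0, num_objects=2),
    FrameResult([150.0], misses=1, false_positives=1, mismatches=1, num_objects=2),
]
```

Keeping the two quantities separate mirrors the split described in the abstract: one number reflects how precisely locations are estimated, the other how well object configurations and labels are recovered over time.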
