Robust Multi-Person Tracking from Moving Platforms

In this paper, we address the problem of multi-person tracking in busy pedestrian zones, using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution, which extracts as much visual information as possible and combines it through cognitive feedback. We propose such an approach, which jointly estimates camera position, stereo depth, object detection, and tracking. We model the interplay between these components using a graphical model. Since the model has to incorporate object-object interactions, and temporal links to past frames, direct inference is intractable. We therefore propose a two-stage procedure: for each frame we first solve a simplified version of the model (disregarding interactions and temporal continuity) to estimate the scene geometry and an overcomplete set of object detections. Conditioned on these results, we then address object interactions, tracking, and prediction in a second step. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver stable tracking performance in scenes of realistic complexity.

[1]  James J. Little,et al.  A Boosted Particle Filter: Multitarget Detection and Tracking , 2004, ECCV.

[2]  Luc Van Gool,et al.  A mobile vision system for robust multi-person tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Fatih Murat Porikli,et al.  Human Detection via Classification on Riemannian Manifolds , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[5]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[6]  Alexei A. Efros,et al.  Putting Objects in Perspective , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[8]  Horst Bischof,et al.  On-line Boosting and Vision , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9]  Luc Van Gool,et al.  Robust Multiperson Tracking from a Mobile Platform , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Ingemar J. Cox,et al.  A review of statistical data association techniques for motion correspondence , 1993, International Journal of Computer Vision.

[11]  James W. Davis,et al.  Integrating Appearance and Motion Cues for Simultaneous Detection and Segmentation of Pedestrians , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[12]  Dariu Gavrila,et al.  A Bayesian Framework for Multi-cue 3D Object Tracking , 2004, ECCV.

[13]  Michael Isard,et al.  CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[14]  Daniel P. Huttenlocher,et al.  Efficient Belief Propagation for Early Vision , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[15]  A. G. Amitha Perera,et al.  A unified framework for tracking through occlusions and across sensor gaps , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[16]  Luc Van Gool,et al.  Dynamic 3D Scene Analysis from a Moving Vehicle , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  James R. Bergen,et al.  Visual odometry , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[18]  A.H. Haddad,et al.  Applied optimal estimation , 1976, Proceedings of the IEEE.

[19]  Mubarak Shah,et al.  3D Model based Object Class Detection in An Arbitrary View , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[20]  Luc Van Gool,et al.  Depth and Appearance for Mobile Scene Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[21]  David Nistér,et al.  A Minimal Solution to the Generalised 3-Point Pose Problem , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[22]  Antonio Torralba,et al.  Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes , 2003, NIPS.

[23]  Luc Van Gool,et al.  Coupled Detection and Trajectory Estimation for Multi-Object Tracking , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[24]  Joachim M. Buhmann,et al.  Object Categorization by Compositional Graphical Models , 2005, EMMCVPR.

[25]  Ramakant Nevatia,et al.  Global data association for multi-object tracking using network flows , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Antonio Torralba,et al.  Learning hierarchical models of scenes, objects, and parts , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[27]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[28]  Andrew J. Davison,et al.  Real-time simultaneous localisation and mapping with a single camera , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[29]  Jean-Philippe Tarel,et al.  Real time obstacle detection in stereovision on non flat road geometry through "v-disparity" representation , 2002, Intelligent Vehicle Symposium, 2002. IEEE.

[30]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[31]  Cordelia Schmid,et al.  Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.

[32]  Pascal Fua,et al.  Robust People Tracking with Global Trajectory Optimization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[33]  Shai Avidan,et al.  Ensemble Tracking , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Dorin Comaniciu,et al.  Kernel-Based Object Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[35]  Stefan Roth,et al.  People-tracking-by-detection and people-detection-by-tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2005, International Journal of Computer Vision.

[37]  D. Scharstein,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001).

[38]  Kevin Murphy,et al.  Bayes net toolbox for Matlab , 1999 .

[39]  Oswald Lanz,et al.  Approximate Bayesian multibody tracking , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  R. Nevatia,et al.  Simultaneous Object Detection and Segmentation by Boosting Local Shape Feature based Classifier , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Dariu Gavrila,et al.  Multi-cue Pedestrian Detection and Tracking from a Moving Vehicle , 2007, International Journal of Computer Vision.

[42]  Yaakov Bar-Shalom,et al.  Sonar tracking of multiple targets using joint probabilistic data association , 1983 .

[43]  Bernt Schiele,et al.  Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.