Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles

We present a novel approach for multi-object tracking which considers object detection and spacetime trajectory estimation as a coupled optimization problem. Our approach is formulated in a minimum description length hypothesis selection framework, which allows our system to recover from mismatches and temporarily lost tracks. Building upon a state-of-the-art object detector, it performs multiview/multicategory object recognition to detect cars and pedestrians in the input images. The 2D object detections are checked for their consistency with (automatically estimated) scene geometry and are converted to 3D observations which are accumulated in a world coordinate frame. A subsequent trajectory estimation module analyzes the resulting 3D observations to find physically plausible spacetime trajectories. Tracking is achieved by performing model selection after every frame. At each time instant, our approach searches for the globally optimal set of spacetime trajectories which provides the best explanation for the current image and for all evidence collected so far while satisfying the constraints that no two objects may occupy the same physical space nor explain the same image pixels at any point in time. Successful trajectory hypotheses are then fed back to guide object detection in future frames. The optimization procedure is kept efficient through incremental computation and conservative hypothesis pruning. We evaluate our approach on several challenging video sequences and demonstrate its performance on both a surveillance-type scenario and a scenario where the input videos are taken from inside a moving vehicle passing through crowded city areas.

[1]  A.H. Haddad,et al.  Applied optimal estimation , 1976, Proceedings of the IEEE.

[2]  D. Reid An algorithm for tracking multiple targets , 1978, 1978 IEEE Conference on Decision and Control including the 17th Symposium on Adaptive Processes.

[3]  Yaakov Bar-Shalom,et al.  Sonar tracking of multiple targets using joint probabilistic data association , 1983 .

[4]  Dariu Gavrila,et al.  Real-time object detection for "smart" vehicles , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[5]  W. Eric L. Grimson,et al.  Adaptive background mixture models for real-time tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[6]  L. Davis,et al.  Real-time multiple vehicle detection and tracking from a moving vehicle , 2000, Machine Vision and Applications.

[7]  Larry S. Davis,et al.  Pedestrian tracking from a moving vehicle , 2000, Proceedings of the IEEE Intelligent Vehicles Symposium 2000 (Cat. No.00TH8511).

[8]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[9]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Endre Boros,et al.  Pseudo-Boolean optimization , 2002, Discret. Appl. Math..

[11]  Dorin Comaniciu,et al.  Kernel-Based Object Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Luc Van Gool,et al.  An adaptive color-based particle filter , 2003, Image Vis. Comput..

[13]  James J. Little,et al.  A Boosted Particle Filter: Multitarget Detection and Tracking , 2004, ECCV.

[14]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[15]  Dariu Gavrila,et al.  A Bayesian Framework for Multi-cue 3D Object Tracking , 2004, ECCV.

[16]  Michael Isard,et al.  CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[17]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[18]  Bernt Schiele,et al.  Pedestrian detection in crowded scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[19]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2005, International Journal of Computer Vision.

[20]  Ingemar J. Cox,et al.  A review of statistical data association techniques for motion correspondence , 1993, International Journal of Computer Vision.

[21]  Ruzena Bajcsy,et al.  Segmentation of range images as the search for geometric parametric models , 1995, International Journal of Computer Vision.

[22]  Cordelia Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Hans-Hellmut Nagel,et al.  Model-based object tracking in monocular image sequences of road traffic scenes , 1993, International Journal of Computer 11263on.

[24]  A. G. Amitha Perera,et al.  A unified framework for tracking through occlusions and across sensor gaps , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[25]  Alexei A. Efros,et al.  Geometric context from a single image , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[26]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[27]  Ramakant Nevatia,et al.  Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[28]  Cordelia Schmid,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[29]  Horst Bischof,et al.  On-line Boosting and Vision , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[30]  Oswald Lanz,et al.  Approximate Bayesian multibody tracking , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Jens Keuchel Multiclass Image Labeling with Semidefinite Programming , 2006, ECCV.

[32]  Konrad Schindler,et al.  Perspective n-View Multibody Structure-and-Motion Through Model Selection , 2006, ECCV.

[33]  Alexei A. Efros,et al.  Putting Objects in Perspective , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[34]  William J. Christmas,et al.  A Novel Data Association Algorithm for Object Tracking in Clutter with Application to Tennis Video Analysis , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[35]  Andrew Zisserman,et al.  Solving Markov Random Fields using Second Order Cone Programming Relaxations , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[36]  Bernt Schiele,et al.  Segmentation Based Multi-Cue Integration for Object Detection , 2006, BMVC.

[37]  Luc Van Gool,et al.  Fast Compact City Modeling for Navigation Pre-Visualization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[38]  Pascal Fua,et al.  Robust People Tracking with Global Trajectory Optimization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[39]  Ramakant Nevatia,et al.  Tracking of Multiple, Partially Occluded Humans based on Static Body Part Detection , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[40]  J Quinonero Candela,et al.  Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Tectual Entailment , 2006, Lecture Notes in Computer Science.

[41]  Bernt Schiele,et al.  Multiple Object Class Detection with a Generative Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[42]  Bernt Schiele,et al.  Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.

[43]  Luc Van Gool,et al.  3D Urban Scene Modeling Integrating Recognition and Reconstruction , 2008, International Journal of Computer Vision.

[44]  Luc Van Gool,et al.  Depth and Appearance for Mobile Scene Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[45]  Luc Van Gool,et al.  Coupled Detection and Trajectory Estimation for Multi-Object Tracking , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[46]  Shai Avidan,et al.  Ensemble Tracking , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[47]  Luc Van Gool,et al.  Dynamic 3D Scene Analysis from a Moving Vehicle , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Vladimir Kolmogorov,et al.  Optimizing Binary MRFs via Extended Roof Duality , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.