A Three-level Architecture for Model-free Detection and Tracking of Independently Moving Objects

We present a three-level architecture for the detection and tracking of independently moving objects (IMOs) in sequences recorded from a moving vehicle. At the first level, image pixels whose optical flow is not entirely induced by the car’s motion are detected by combining dense optical flow, egomotion extracted from this optical flow, and dense stereo. These pixels are segmented, and an attention mechanism selects them for processing at finer resolution at the second level, which makes use of sparse 2D and 3D edge descriptors. Based on the rich and precise information available at the second level, the full rigid motion of the environment and of each IMO is computed. This motion information is then used for tracking, filtering, and building a 3D model of the street structure as well as of the IMOs. This multi-level architecture allows us to combine the strengths of dense and sparse processing methods in terms of precision and computational complexity, and to dedicate more processing capacity to the important parts of the scene (the IMOs).
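
The first-level idea, flagging pixels whose measured flow disagrees with the flow that egomotion alone would induce, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with focal length `f` in pixels, a principal point at the image centre, a per-pixel depth map (for example from stereo), and the classical instantaneous motion-field equations under one common sign convention. The function names, the parameters `t` and `w`, and the residual threshold `thresh` are all hypothetical.

```python
import numpy as np

def egomotion_flow(depth, f, t, w):
    """Flow (in pixels/frame) that a static scene would produce under camera motion.

    Assumed convention: a static point P moves as dP/dt = -t - w x P in the
    camera frame, with t the camera translation and w its rotation rate.
    """
    h, wd = depth.shape
    ys, xs = np.mgrid[0:h, 0:wd].astype(float)
    # Image coordinates relative to an assumed central principal point.
    x = xs - (wd - 1) / 2.0
    y = ys - (h - 1) / 2.0
    tx, ty, tz = t
    wx, wy, wz = w
    # Translational part depends on depth; rotational part does not.
    u = (-f * tx + x * tz) / depth + (x * y / f) * wx - (f + x**2 / f) * wy + y * wz
    v = (-f * ty + y * tz) / depth + (f + y**2 / f) * wx - (x * y / f) * wy - x * wz
    return np.stack([u, v], axis=-1)

def imo_candidate_mask(flow, depth, f, t, w, thresh=1.0):
    """Mark pixels whose measured flow deviates from the egomotion-induced flow."""
    residual = flow - egomotion_flow(depth, f, t, w)
    return np.linalg.norm(residual, axis=-1) > thresh

# Synthetic example: a static scene at constant depth plus one moving patch.
depth = np.full((20, 30), 10.0)
t, w, f = (0.0, 0.0, 1.0), (0.0, 0.01, 0.0), 100.0
flow = egomotion_flow(depth, f, t, w)
flow[5:10, 5:10, 0] += 3.0  # extra horizontal motion of a hypothetical IMO
mask = imo_candidate_mask(flow, depth, f, t, w)
```

In this sketch the mask is true only inside the perturbed patch. A practical system would also need to handle flow and depth noise, which a fixed threshold does not address.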
