Distraction suppression for vision-based pose estimation at city scales

This paper addresses egomotion estimation in highly dynamic, heavily cluttered urban environments over long periods of time. The problem is challenging for vision-based systems because extreme scene movement caused by dynamic objects (e.g., enormous buses) can produce erroneous motion estimates. We describe two methods that combine 3D scene priors with vision sensors to generate background-likelihood images, which act as probability masks for objects that are not part of the scene prior. The resulting system copes with extreme scene motion, even when most of the image is obscured. We present results on real data collected in central London during rush hour and demonstrate the benefits of our techniques on a core navigation system: visual odometry.
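
As a rough illustration of how a background-likelihood image can act as a probability mask, the sketch below compares a disparity image predicted from a 3D scene prior with one observed by the camera, and maps the per-pixel difference to a value in [0, 1]. This is not the paper's formulation: the Gaussian form, the parameter sigma, and the function names background_likelihood and feature_weights are all assumptions made for the example.

```python
import numpy as np


def background_likelihood(prior_disparity, observed_disparity, sigma=1.0):
    """Per-pixel value in [0, 1]; near 1 where the observation agrees with the prior.

    Assumption: a Gaussian falloff on the disparity difference. The abstract
    does not specify the actual likelihood model.
    """
    prior = np.asarray(prior_disparity, dtype=float)
    observed = np.asarray(observed_disparity, dtype=float)
    residual = observed - prior
    return np.exp(-0.5 * (residual / sigma) ** 2)


def feature_weights(likelihood, pixels):
    """Look up the likelihood at integer (row, col) feature locations."""
    pixels = np.asarray(pixels, dtype=int)
    return likelihood[pixels[:, 0], pixels[:, 1]]


# Toy data: a flat prior, with a hypothetical dynamic object (much closer
# than the prior surface) covering part of the observed image.
prior = np.full((4, 6), 10.0)
observed = prior.copy()
observed[1:3, 2:5] = 25.0

mask = background_likelihood(prior, observed, sigma=2.0)
weights = feature_weights(mask, [(0, 0), (2, 3)])
```

A visual odometry front end could use such weights to down-weight or discard features that fall on pixels unlikely to belong to the static scene; how the weights enter the estimator is left open here.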
