Classification and Pose Estimation of Vehicles in Videos by 3D Modeling within Discrete-Continuous Optimization

This paper presents a framework for classifying vehicles and estimating their poses in videos, assuming that 3D models of the vehicles are given. We rank possible poses and types for each frame and exploit temporal coherence between consecutive frames for refinement. Our contribution is twofold. First, we cast the estimation of a vehicle's pose and type as the solution of a continuous optimization problem over space and time. Because this problem is non-convex, good starting points are important. We propose to obtain them by a discrete temporal optimization that reaches a global optimum over a ranked discrete set of possible types and poses. Second, to guarantee the effectiveness of the proposed discrete-continuous optimization, we present a novel way to efficiently reduce the search space of potential 3D model types and poses that the discrete optimizer must consider for each frame. It avoids the common, expensive evaluation of all possible discretized hypotheses. The key to efficiency is a novel combination of detecting the vehicle, rendering the 3D models, matching projected edges to the input images, and using a tree-structured Markov Random Field, which yields fast and globally optimal inference and forces the vehicle to follow a feasible motion model in the initial phase. Quantitative and qualitative experiments on a variety of videos with a wide variation of vehicle types show superior results to state-of-the-art methods.
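
The discrete temporal step described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the simplest tree-structured model, a chain with one node per frame, and uses min-sum dynamic programming (the Viterbi recursion) to find the globally optimal sequence of candidates. The function name, the toy yaw angles, the unary matching costs, and the motion penalty are all hypothetical values chosen for illustration.

```python
def best_hypothesis_sequence(unary, pairwise):
    """Min-sum dynamic programming over a chain of frames.

    unary[t][k]       : matching cost of candidate k in frame t (lower is better)
    pairwise(t, i, j) : motion cost of going from candidate i in frame t
                        to candidate j in frame t + 1
    Returns (total_cost, path), where path[t] is the chosen candidate in frame t.
    """
    cost = [list(unary[0])]   # cost[t][k]: best cost of any sequence ending in k
    back = []                 # back[t-1][k]: best predecessor of k in frame t
    for t in range(1, len(unary)):
        row_cost, row_back = [], []
        for j, u in enumerate(unary[t]):
            prev = cost[t - 1]
            best_i = min(range(len(prev)),
                         key=lambda i: prev[i] + pairwise(t - 1, i, j))
            row_cost.append(prev[best_i] + pairwise(t - 1, best_i, j) + u)
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest candidate in the last frame.
    last = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = [last]
    for row_back in reversed(back):
        path.append(row_back[path[-1]])
    path.reverse()
    return cost[-1][last], path


# Hypothetical toy data: two ranked candidates per frame, described by yaw (degrees).
yaw = [[0, 90], [10, 180], [20, 100]]
# Hypothetical edge-matching costs; candidate 1 looks slightly better in every frame.
unary = [[1.0, 0.8], [1.0, 0.9], [1.0, 0.7]]


def motion_cost(t, i, j):
    """Assumed motion model: penalize large yaw changes between frames."""
    d = abs(yaw[t][i] - yaw[t + 1][j]) % 360
    return 0.05 * min(d, 360 - d)


total, path = best_hypothesis_sequence(unary, motion_cost)
print(total, path)
```

In this toy setting, picking the best candidate independently in each frame would select candidate 1 everywhere, giving a yaw track of 90, 180, 100 degrees. The temporal optimization instead returns the path `[0, 0, 0]`, the smooth track of 0, 10, 20 degrees, because the assumed motion cost outweighs the small differences in matching cost. A result of this kind is what would then seed the continuous refinement.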
