Paper Information - Accurate Single Image Multi-modal Camera Pose Estimation

Accurate Single Image Multi-modal Camera Pose Estimation

A well-known problem in photogrammetry and computer vision is the precise and robust determination of camera poses with respect to a given 3D model. In this work, we propose a novel multi-modal method for single-image camera pose estimation with respect to 3D models with intensity information (e.g., LiDAR data with reflectance information). We use a direct point-based rendering approach to generate synthetic 2D views from 3D datasets in order to bridge the dimensionality gap. The proposed method then establishes 2D/2D point and local region correspondences based on a novel self-similarity distance measure. Correct correspondences are robustly identified by searching for small regions with a similar geometric relationship of local self-similarities using a Generalized Hough Transform. After backprojection of the generated features into 3D, a standard Perspective-n-Point problem is solved to yield an initial camera pose. The pose is then accurately refined using an intensity-based 2D/3D registration approach. An evaluation on Vis/IR 2D and airborne and terrestrial 3D datasets shows that the proposed method is applicable to a wide range of different sensor types. In addition, the approach outperforms standard global multi-modal 2D/3D registration approaches based on Mutual Information with respect to robustness and speed. Potential applications are widespread and include, for instance, multi-spectral texturing of 3D models, SLAM, sensor data fusion, multi-spectral camera calibration, and super-resolution.
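The correspondence filtering step can be illustrated with a much simplified voting scheme. The sketch below is not the method of the paper, which votes on the geometric relationship of local self-similarity regions. It assumes instead that correct matches between the real image and the synthetic view share roughly one 2D translation, and it keeps only the matches that fall into the fullest accumulator cell. The function name `hough_filter`, the `bin_size` parameter, and the sample coordinates are hypothetical.

```python
from collections import defaultdict


def hough_filter(matches, bin_size=8.0):
    """Keep the matches that agree with the dominant 2D translation.

    matches:  list of ((x_real, y_real), (x_synth, y_synth)) pixel pairs.
    bin_size: width of one accumulator cell in pixels (assumed value).
    """
    accumulator = defaultdict(list)
    for index, ((xr, yr), (xs, ys)) in enumerate(matches):
        # Each correspondence votes for the displacement it implies,
        # quantized to the nearest accumulator cell.
        cell = (round((xs - xr) / bin_size), round((ys - yr) / bin_size))
        accumulator[cell].append(index)
    if not accumulator:
        return []
    # The cell with the most votes is taken as the consistent set.
    best = max(accumulator.values(), key=len)
    return [matches[i] for i in best]


# Three matches displaced by roughly (16, -8) pixels, plus two outliers.
matches = [
    ((10, 10), (26, 2)),
    ((50, 40), (67, 32)),
    ((100, 80), (115, 71)),
    ((20, 20), (200, 150)),
    ((60, 10), (5, 90)),
]
kept = hough_filter(matches)
```

In a pipeline like the one described, the surviving matches would be backprojected into 3D and passed to a Perspective-n-Point solver. A hard-binned accumulator can split votes that lie near a cell boundary, so a practical version would likely also vote into neighboring cells or smooth the accumulator.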
