Visibility learning in large-scale urban environment

A crucial step in many vision-based applications, such as localization and structure from motion, is data association between a large map of known 3D points and the 2D features perceived by a new camera. In this paper, we propose a novel approach to predicting the visibility of known 3D points with respect to a query camera in large-scale environments. We model the visibility of each 3D point with respect to a camera pose using a memory-based learning algorithm, in which a distance metric between cameras is learned in an entirely non-parametric way. We show that by fully exploiting the geometric relationships between the 3D map and the camera poses, together with the related appearance information, the resulting prediction is more robust and more efficient than conventional approaches. We demonstrate the speed and accuracy of our algorithm on a large urban 3D model.
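To make the memory-based idea concrete, the following is a minimal sketch of visibility prediction by kernel-weighted voting over the nearest stored cameras. It is an illustration under stated assumptions, not the method of the paper: the function names, the 6-vector camera encoding (position plus unit viewing direction), the fixed weights `w_pos` and `w_dir`, the Gaussian kernel, and the threshold are all hypothetical, and the hand-picked distance stands in for the metric that the paper learns from data.

```python
import numpy as np


def camera_distance(query, cams, w_pos=1.0, w_dir=1.0):
    """Distance between one query camera and N stored cameras.

    Each camera is a 6-vector: 3D position followed by a unit viewing
    direction. The fixed weights are an assumption for illustration; they
    are a placeholder for a learned metric.
    """
    d_pos = np.linalg.norm(cams[:, :3] - query[:3], axis=1)
    # Angle between viewing directions, clipped to guard against rounding.
    cos = np.clip(cams[:, 3:] @ query[3:], -1.0, 1.0)
    d_dir = np.arccos(cos)
    return w_pos * d_pos + w_dir * d_dir


def predict_visibility(query, cams, visible, k=2, bandwidth=1.0, threshold=0.5):
    """Predict which of M map points the query camera is likely to see.

    visible is an (N, M) boolean matrix: visible[i, j] is True when stored
    camera i observed point j. Returns a boolean vector of length M.
    """
    dist = camera_distance(query, cams)
    nearest = np.argsort(dist)[:k]
    # Gaussian kernel: closer stored cameras get a larger vote.
    weights = np.exp(-(dist[nearest] / bandwidth) ** 2)
    # Guard against all weights underflowing to zero for far-away queries.
    total = max(weights.sum(), 1e-12)
    scores = weights @ visible[nearest].astype(float) / total
    return scores >= threshold


# Three stored cameras and three map points (toy data).
cams = np.array([
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [10.0, 0.0, 0.0, -1.0, 0.0, 0.0],
])
visible = np.array([
    [True, True, False],
    [True, False, False],
    [False, False, True],
])
query = np.array([0.1, 0.0, 0.0, 1.0, 0.0, 0.0])
prediction = predict_visibility(query, cams, visible)
```

In this toy setup the query sits next to the first two cameras, so only the points they observed receive votes. Restricting 2D-3D matching to the predicted-visible points is what would reduce the data association cost in a real pipeline.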
