Paper information - Progressive Large Scale-Invariant Image Matching in Scale Space

The power of modern image matching approaches is still fundamentally limited by abrupt scale changes between images. In this paper, we propose a scale-invariant image matching approach to tackle very large scale variations between views. Drawing inspiration from scale space theory, we start by encoding the image’s scale space into a compact multi-scale representation. Then, rather than trying to find the exact feature matches all in one step, we propose a progressive two-stage approach. First, we determine the related scale levels in scale space, which enclose the inlier feature correspondences, based on an optimal and exhaustive matching in a limited scale space. Second, we produce both the image similarity measurement and the feature correspondences simultaneously by restricting matching to the related scale levels in a robust way. The matching performance has been extensively evaluated on vision tasks including image retrieval, feature matching and Structure-from-Motion (SfM). The successful fusion of high aerial and low ground-level views, a challenging case with significant scale differences, demonstrates the superiority of the proposed approach.
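The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the descriptors, the scoring rule, or the robust matching step. The sketch assumes each image is a dictionary mapping a scale level to a list of descriptor vectors, that stage one scores level pairs by counting mutual nearest neighbours under a distance threshold, and that stage two uses a ratio test. All function names and parameters (`find_related_levels`, `match_restricted`, `max_dist`, `ratio`) are hypothetical.

```python
import math
from itertools import product


def nearest(query, candidates):
    """Return (index, best distance, second-best distance) for the closest candidate."""
    dists = sorted((math.dist(query, c), i) for i, c in enumerate(candidates))
    best_d, best_i = dists[0]
    second_d = dists[1][0] if len(dists) > 1 else math.inf
    return best_i, best_d, second_d


def mutual_matches(desc_a, desc_b, max_dist):
    """Exhaustive mutual nearest-neighbour matches closer than max_dist."""
    if not desc_a or not desc_b:
        return []
    matches = []
    for ia, da in enumerate(desc_a):
        ib, best, _ = nearest(da, desc_b)
        back, _, _ = nearest(desc_b[ib], desc_a)
        if back == ia and best <= max_dist:
            matches.append((ia, ib))
    return matches


def find_related_levels(levels_a, levels_b, max_dist=0.5):
    """Stage 1 (assumed scoring): pick the pair of scale levels with the most matches."""
    best_pair, best_score = None, 0
    for (la, da), (lb, db) in product(levels_a.items(), levels_b.items()):
        score = len(mutual_matches(da, db, max_dist))
        if score > best_score:
            best_pair, best_score = (la, lb), score
    return best_pair


def match_restricted(levels_a, levels_b, pair, ratio=0.8):
    """Stage 2 (assumed ratio test): match only between the related levels.

    Returns an image similarity in [0, 1] together with the correspondences.
    """
    if pair is None:
        return 0.0, []
    da, db = levels_a[pair[0]], levels_b[pair[1]]
    matches = []
    for ia, d in enumerate(da):
        ib, best, second = nearest(d, db)
        if best < ratio * second:
            matches.append((ia, ib))
    return len(matches) / max(len(da), len(db)), matches


# Toy data: level 1 of image A resembles level 0 of image B.
levels_a = {0: [(0.0, 0.0), (5.0, 5.0)],
            1: [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]}
levels_b = {0: [(1.0, 0.1), (0.1, 1.0), (1.0, 0.9)],
            1: [(9.0, 9.0), (7.0, 3.0)]}

pair = find_related_levels(levels_a, levels_b)
similarity, matches = match_restricted(levels_a, levels_b, pair)
print(pair, similarity, matches)
```

On the toy data, stage one selects level 1 of the first image and level 0 of the second, and stage two then matches only within that pair. In a real pipeline the exhaustive search would be replaced by an approximate nearest-neighbour index, and the correspondences would be verified geometrically.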
