GraphMatch: Efficient Large-Scale Graph Construction for Structure from Motion

We present GraphMatch, an approximate yet efficient method for building the matching graph for large-scale structure-from-motion (SfM) pipelines. GraphMatch leverages two priors that predict which image pairs are likely to match, thereby making the matching process for SfM much more efficient. The first prior is a score computed from the distance between the Fisher vectors of any two images. The second prior is based on the graph distance between vertices in the underlying matching graph. GraphMatch combines these two priors into an iterative "sample-and-propagate" scheme similar to the PatchMatch algorithm. Its sampling stage uses the Fisher similarity prior to guide the search for matching image pairs, while its propagation stage explores the neighbors of matched pairs to find new pairs with a high image similarity score. Our experiments show that GraphMatch finds more matching image pairs than competing approximate methods while also being the most efficient.
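The sample-and-propagate idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_and_propagate`, the parameters `iters` and `per_step`, the dense `scores` matrix standing in for the Fisher-vector similarity prior, and the `verify` callback standing in for geometric verification are all assumptions made for illustration. The graph-distance prior is approximated here by simply restricting propagation to neighbors of already matched images.

```python
def sample_and_propagate(scores, verify, iters=2, per_step=2):
    """Toy sample-and-propagate loop (illustrative assumptions throughout).

    scores[i][j]: hypothetical similarity prior between images i and j.
    verify(i, j): hypothetical, expensive pairwise check returning True/False.
    Returns the adjacency sets of the matching graph and the set of tested pairs.
    """
    n = len(scores)
    tested = set()                      # unordered pairs already verified
    edges = {i: set() for i in range(n)}

    def key(i, j):
        return (min(i, j), max(i, j))

    def try_pair(i, j):
        if i == j or key(i, j) in tested:
            return
        tested.add(key(i, j))
        if verify(i, j):
            edges[i].add(j)
            edges[j].add(i)

    # Candidates for each image, ranked by descending similarity prior.
    ranked = [
        sorted((j for j in range(n) if j != i), key=lambda j: -scores[i][j])
        for i in range(n)
    ]

    for _ in range(iters):
        # Sampling stage: test the best-scoring untested candidates.
        for i in range(n):
            fresh = [j for j in ranked[i] if key(i, j) not in tested]
            for j in fresh[:per_step]:
                try_pair(i, j)

        # Propagation stage: test neighbors of matched neighbors,
        # preferring those with a high similarity score.
        for i in range(n):
            two_hop = {k for j in edges[i] for k in edges[j]} - edges[i] - {i}
            fresh = sorted(
                (k for k in two_hop if key(i, k) not in tested),
                key=lambda k: -scores[i][k],
            )
            for k in fresh[:per_step]:
                try_pair(i, k)

    return edges, tested


# Toy usage: six images in two clusters; only same-cluster pairs match.
cluster = [0, 0, 0, 1, 1, 1]
toy_scores = [
    [0.9 if cluster[i] == cluster[j] else 0.1 for j in range(6)]
    for i in range(6)
]
toy_edges, toy_tested = sample_and_propagate(
    toy_scores, lambda i, j: cluster[i] == cluster[j]
)
```

The point of the design is that `verify` is called only on pairs the priors rank highly, rather than on all pairs. How much work this saves depends on the quality of the priors and on the chosen `iters` and `per_step`.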
