Semantic Correspondence with Geometric Structure Analysis

This article studies the correspondence problem for semantically similar images, which is challenging because visual and geometric deformations occur jointly. We introduce the Flip-aware Distance Ratio (FDR) method, which approaches the problem from the perspective of geometric structure analysis. First, a distance ratio constraint enforces geometric consistency between images with large visual variations, while a smoothness term tolerates local geometric jitter. For challenging cases with symmetric structures, the method exploits curl to suppress mismatches. Image correspondence is then formulated as a permutation problem, for which we propose a Gradient Guided Simulated Annealing (GGSA) algorithm that performs robust discrete optimization. Experiments on simulated and real-world datasets, where both visual and geometric deformations are present, indicate that our method significantly improves on the baselines for both visually and semantically similar images.
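The abstract gives no formulas, so the following Python sketch only illustrates the general idea of treating correspondence as a search over permutations scored by distance ratios. Everything in it is an assumption made for illustration: the cost `ratio_cost` (variance of log distance ratios), the orientation test `flip_fraction` (a simple stand-in for a flip check, not the curl term of the article), and the optimizer `anneal`, which is plain simulated annealing with swap moves rather than the gradient-guided GGSA variant.

```python
import math
import random
from itertools import combinations


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ratio_cost(src, dst, perm):
    """Hypothetical cost: variance of log distance ratios over all point pairs.

    It is zero when dst[perm[i]] is a similarity transform of src[i],
    because every pairwise distance is then scaled by the same factor.
    """
    logs = [
        math.log(dist(src[i], src[j]) / dist(dst[perm[i]], dst[perm[j]]))
        for i, j in combinations(range(len(src)), 2)
    ]
    mean = sum(logs) / len(logs)
    return sum((v - mean) ** 2 for v in logs) / len(logs)


def orientation(a, b, c):
    """Sign (-1, 0, 1) of the 2D cross product (b - a) x (c - a)."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def flip_fraction(src, dst, perm):
    """Fraction of point triples whose orientation differs between images.

    Distance ratios alone cannot tell a shape from its mirror image;
    this stand-in check can (assumption: it is not the article's curl term).
    """
    triples = list(combinations(range(len(src)), 3))
    flipped = sum(
        orientation(src[i], src[j], src[k])
        != orientation(dst[perm[i]], dst[perm[j]], dst[perm[k]])
        for i, j, k in triples
    )
    return flipped / len(triples)


def anneal(src, dst, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Plain simulated annealing over permutations using swap moves."""
    rng = random.Random(seed)
    perm = list(range(len(src)))
    rng.shuffle(perm)
    cost = ratio_cost(src, dst, perm)
    best, best_cost = perm[:], cost
    temp = t0
    for _ in range(steps):
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]  # propose a swap
        new_cost = ratio_cost(src, dst, perm)
        accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # undo the rejected swap
        temp *= cooling
    return best, best_cost


if __name__ == "__main__":
    src = [(0, 0), (4, 0), (5, 3), (2, 6), (-1, 2)]
    # dst is src rotated by 90 degrees and scaled by 2 (illustrative data).
    dst = [(-2 * y, 2 * x) for x, y in src]
    perm, cost = anneal(src, dst)
    print(perm, cost, flip_fraction(src, dst, perm))
```

Under these assumptions the ratio cost is invariant to rotation, translation, uniform scaling and mirroring, which is why a separate orientation check is needed to flag flipped matches. Simulated annealing is a stochastic heuristic, so the returned permutation is the best one visited and is not guaranteed to be the global optimum.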
