Coarse to fine non-rigid registration: a chain of scale-specific neural networks for multimodal image alignment with application to remote sensing

We tackle the problem of non-rigid registration of multimodal images, which is of prime importance in remote sensing and medical imaging. The difficulties encountered by classical registration approaches include feature design and slow optimization by gradient descent. Analyzing these methods, we note the significance of the notion of scale. We design easy-to-train, fully convolutional neural networks able to learn scale-specific features. Once chained appropriately, they perform global registration in linear time, dispensing with gradient descent schemes by directly predicting the deformation. We demonstrate their quality and speed on various remote sensing multimodal image alignment tasks. In particular, we correctly register cadastral maps of buildings as well as road polylines onto RGB images, and outperform current keypoint matching methods.
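
The chaining idea can be illustrated with a minimal sketch, given here under explicit assumptions rather than as the actual method. Each scale-specific network is represented by a hypothetical callable `predict(fixed_s, moving_s)` that returns a residual displacement field at its own resolution; the trained networks themselves are not reproduced. The helper names (`downsample`, `upsample_flow`, `warp`, `chain_register`), the average-pooling pyramid, the nearest-neighbour resampling, and the additive accumulation of residual fields are all simplifying assumptions made for the illustration.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2D array by an integer factor (shape must be divisible)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample_flow(flow, factor):
    """Nearest-neighbour upsampling of a (2, h, w) field, rescaling vectors to full-res pixels."""
    return np.repeat(np.repeat(flow, factor, axis=1), factor, axis=2) * factor

def warp(img, flow):
    """Resample img at x + flow(x) using nearest-neighbour lookup, clamped at the borders."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.rint(ys + flow[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[1]).astype(int), 0, w - 1)
    return img[src_y, src_x]

def chain_register(fixed, moving, stages):
    """Accumulate a deformation from coarse to fine.

    stages: list of (factor, predict) pairs ordered from coarsest to finest.
    predict is a stand-in for a scale-specific network (hypothetical interface):
    it maps two images at reduced resolution to a (2, h, w) residual field
    expressed in pixels of that reduced resolution.
    """
    flow = np.zeros((2,) + fixed.shape)
    for factor, predict in stages:
        # Apply the deformation estimated so far, then look at the pair at this scale.
        warped = warp(moving, flow)
        residual = predict(downsample(fixed, factor), downsample(warped, factor))
        # Assumption: residuals are simply added; exact composition of warps is omitted.
        flow = flow + upsample_flow(residual, factor)
    return flow
```

Each stage runs a single forward pass and no iterative optimization takes place, which is the property the abstract relies on. The predictors passed in `stages` would have to be replaced by trained models for the sketch to register real images.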
