ICON: Learning Regular Maps Through Inverse Consistency

Learning maps between data samples is fundamental. Applications range from representation learning, image translation, and generative modeling to the estimation of spatial deformations. Such maps relate feature vectors or map between feature spaces. Well-behaved maps should be regular; this regularity can be imposed explicitly or may emanate from the data itself. We explore what induces regularity for spatial transformations, e.g., when computing image registrations. Classical optimization-based models compute maps between pairs of samples and rely on an appropriate regularizer for well-posedness. Recent deep learning approaches have attempted to avoid using such regularizers altogether by relying on the sample population instead. We explore whether it is possible to obtain spatial regularity using an inverse consistency loss only, and elucidate what explains map regularity in such a context. We find that deep networks combined with an inverse consistency loss and randomized off-grid interpolation yield well-behaved, approximately diffeomorphic, spatial transformations. Despite the simplicity of this approach, our experiments present compelling evidence, on both synthetic and real data, that regular maps can be obtained without carefully tuned explicit regularizers, while achieving competitive registration performance.

1. Motivation

Learning maps between feature vectors or spaces is an important task. Feature vector maps are used to improve representation learning [7] or to learn correspondences in natural language processing [4]. Maps between spaces are important for generative models that use normalizing flows [24] (to map between a simple and a complex probability distribution), and for determining spatial correspondences between images, e.g., in optical flow [16] to determine motion from videos [12], in depth estimation from stereo images [25], or in medical image registration [39, 40].

Regular maps are typically desired; e.g., diffeomorphic maps for normalizing flows to properly map densities, or for medical image registration to map to an atlas space [20]. Estimating such maps requires an appropriate choice of transformation model. This entails picking a parameterization, which can be simple and depend on few parameters (e.g., an affine transformation), or which can have millions of parameters for 3D nonparametric approaches [14]. Regularity is achieved by 1) picking a simple transformation model with limited degrees of freedom, 2) regularizing the transformation parameters, or 3) implicitly through the data itself. Our goal is to demonstrate and understand how spatial regularity of a transformation can be achieved by encouraging inverse consistency of a map. Our motivating example is

[1] Thomas Brox, et al. FlowNet: Learning Optical Flow with Convolutional Networks, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[2] Jong Chul Ye, et al. CycleMorph: Cycle Consistent Unsupervised Deformable Image Registration, 2020, Medical Image Anal.

[3] L. Pascale, et al. The Monge problem with vanishing gradient penalization: Vortices and asymptotic profile, 2014, 1407.7022.

[4] Prateek Jain, et al. The Pitfalls of Simplicity Bias in Neural Networks, 2020, NeurIPS.

[5] Mert R. Sabuncu, et al. VoxelMorph: A Learning Framework for Deformable Medical Image Registration, 2018, IEEE Transactions on Medical Imaging.

[6] Ivan Kobyzev, et al. Normalizing Flows: An Introduction and Review of Current Methods, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Mert R. Sabuncu, et al. Unsupervised Learning for Fast Probabilistic Diffeomorphic Registration, 2018, MICCAI.

[8] Marc Niethammer, et al. Region-specific Diffeomorphic Metric Mapping, 2019, NeurIPS.

[9] Xu Han, et al. Networks for Joint Affine and Non-Parametric Image Registration, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Allan Pinkus, et al. Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function, 1991, Neural Networks.

[11] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[12] W. Eric L. Grimson, et al. Efficient Population Registration of 3D Data, 2005, CVBIA.

[13] Stefan Roth, et al. Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Berthold K. P. Horn, et al. Determining Optical Flow, 1981, Other Conferences.

[15] Tie-Yan Liu, et al. Learning to rank for information retrieval, 2009, SIGIR.

[16] P. Pérez, et al. SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17] François-Xavier Vialard, et al. Metric Learning for Image Registration, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18] D. Mumford, et al. Vanishing Geodesic Distance on Spaces of Submanifolds and Diffeomorphisms, 2004, math/0409303.

[19] A neural sorting network with O(1) time complexity, 1990, 1990 IJCNN International Joint Conference on Neural Networks.

[20] Mark Holden, et al. A Review of Geometric Transformations for Nonrigid Body Registration, 2008, IEEE Transactions on Medical Imaging.

[21] Olivier Teboul, et al. Fast Differentiable Sorting and Ranking, 2020, ICML.

[22] Iasonas Kokkinos, et al. Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance, 2018, ECCV.

[23] Lu Yuan, et al. Cross-Domain Correspondence Learning for Exemplar-Based Image Translation, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Daniel Rueckert, et al. Simultaneous Multi-scale Registration Using Large Deformation Diffeomorphic Metric Mapping, 2011, IEEE Transactions on Medical Imaging.

[25] Patrick Bouthemy, et al. Optical flow modeling and computation: A survey, 2015, Comput. Vis. Image Underst.

[26] Xiao Yang, et al. Fast Predictive Image Registration, 2016, LABELS/DLMIA@MICCAI.

[27] Ross T. Whitaker, et al. A Cooperative Autoencoder for Population-Based Regularization of CNN Image Registration, 2019, MICCAI.

[28] Wen-Tsuen Chen, et al. A neural sorting network with O(1) time complexity, 1990, IJCNN.

[29] Jong Chul Ye, et al. Unsupervised Deformable Image Registration Using Cycle-Consistent CNN, 2019, MICCAI.

[30] Gary E. Christensen, et al. Consistent image registration, 2001, IEEE Transactions on Medical Imaging.

[31] Michael J. Black, et al. Optical Flow Estimation Using a Spatial Pyramid Network, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Marc Niethammer, et al. Quicksilver: Fast predictive image registration – A deep learning approach, 2017, NeuroImage.

[33] Guido Gerig, et al. Unbiased diffeomorphic atlas construction for computational anatomy, 2004, NeuroImage.

[34] Jan Modersitzki, et al. Numerical Methods for Image Registration, 2004.

[35] Marco Cuturi, et al. Computational Optimal Transport: With Applications to Data Science, 2019.

[36] John Blitzer, et al. Domain Adaptation with Structural Correspondence Learning, 2006, EMNLP.

[37] Torsten Rohlfing, et al. Image Similarity and Tissue Overlaps as Surrogates for Image Registration Accuracy: Widely Used but Unreliable, 2012, IEEE Transactions on Medical Imaging.

[38] Stefan Zachow, et al. Automated segmentation of knee bone and cartilage combining statistical shape knowledge and convolutional neural networks: Data from the Osteoarthritis Initiative, 2019, Medical Image Anal.

[39] Jun Zhang, et al. Inverse-Consistent Deep Networks for Unsupervised Deformable Image Registration, 2018, ArXiv.

[40] Xinlei Chen, et al. Exploring Simple Siamese Representation Learning, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41] James C Gee, et al. Learning image-based spatial transformations via convolutional neural networks: A review, 2019, Magnetic resonance imaging.

[42] Jerrold E. Marsden, et al. Averaged Template Matching Equations, 2001, EMMCVPR.

[43] Jascha Sohl-Dickstein, et al. Neural reparameterization improves structural optimization, 2019, ArXiv.

[44] Mohammed Bennamoun, et al. A Survey on Deep Learning Techniques for Stereo-Based Depth Estimation, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45] Xiaoou Tang, et al. LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46] Marc Niethammer, et al. An optimal control approach for deformable registration, 2009, 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[47] Nikos Paragios, et al. Deformable Medical Image Registration: A Survey, 2013, IEEE Transactions on Medical Imaging.