Multimodal medical image registration via common representations learning and differentiable geometric constraints

Multimodal medical image registration remains challenging when the images differ strongly in appearance and are only coarsely aligned. Previous deep network approaches cannot handle such a high degree of variability and preclude the use of strong geometric constraints. The authors introduce a novel deep architecture that not only produces image representations well suited to this task, but also exploits knowledge of geometric constraints for robust registration. By forcing the representations of the different modalities to live in a common semantic space, they obtain convolutional features that tend to respond to object parts consistently across modalities. This yields a unique description for each object fragment and lets the end user follow the model's decision process, whereas most existing models remain opaque and difficult to explain. By using a differentiable spatial transformer to compensate for the transform, they integrate geometric consensus into the cost function, enabling end-to-end model optimisation, which has not been exploited before. The authors evaluate their method on a very challenging medical image dataset. Experiments demonstrate that the proposed method provides a plausible representation and outperforms state-of-the-art approaches by a significant margin.
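The role of the spatial transformer can be illustrated with a minimal sketch. The code below is not the authors' implementation: it shows only the forward computation of an affine warp with bilinear sampling in NumPy, applied to a 2D array standing in for a feature map from the common representation space. The function names (`affine_grid`, `bilinear_sample`, `warp`, `registration_cost`) are hypothetical, and the mean squared difference is a stand-in for the paper's geometry-aware cost, which the abstract does not specify. Because every step is built from sums and products of interpolation weights, the same computation written in an automatic differentiation framework would let gradients flow back to the transform parameters.

```python
import numpy as np


def affine_grid(theta, height, width):
    """Map every output pixel to a source location in normalised [-1, 1] coordinates.

    theta is a 2x3 affine matrix acting on homogeneous (x, y, 1) coordinates.
    """
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, height),
        np.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # shape (H, W, 3)
    return coords @ theta.T                                  # shape (H, W, 2), order (x, y)


def bilinear_sample(image, grid):
    """Sample a 2D array at the grid locations; points outside the image read as 0."""
    h, w = image.shape
    # Convert normalised coordinates to pixel coordinates.
    x = (grid[..., 0] + 1.0) * 0.5 * (w - 1)
    y = (grid[..., 1] + 1.0) * 0.5 * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0  # fractional offsets used as interpolation weights

    def at(yy, xx):
        valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
        values = image[np.clip(yy, 0, h - 1), np.clip(xx, 0, w - 1)]
        return np.where(valid, values, 0.0)

    return (
        (1 - wx) * (1 - wy) * at(y0, x0)
        + wx * (1 - wy) * at(y0, x1)
        + (1 - wx) * wy * at(y1, x0)
        + wx * wy * at(y1, x1)
    )


def warp(image, theta):
    """Resample `image` under the affine transform `theta`."""
    return bilinear_sample(image, affine_grid(theta, *image.shape))


def registration_cost(fixed, moving, theta):
    """Mean squared difference between the fixed map and the warped moving map.

    Assumption: a simple stand-in for the paper's cost function.
    """
    return float(np.mean((fixed - warp(moving, theta)) ** 2))
```

Under these assumptions, calling `registration_cost(fixed, moving, theta)` with the identity matrix `[[1, 0, 0], [0, 1, 0]]` as `theta` compares the two maps without any warping, and an optimiser would adjust `theta` to reduce the cost.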
