Efficient Acceleration of Mutual Information Computation for Nonrigid Registration Using CUDA

In this paper, we propose an efficient method for accelerating the nonrigid registration of multimodal images on a graphics processing unit (GPU). The key contribution of our method is its efficient use of on-chip memory for both normalized mutual information (NMI) computation and hierarchical B-spline deformation, which together compose a well-known registration algorithm. We implement this registration algorithm as a Compute Unified Device Architecture (CUDA) program with an efficient parallel scheme and several optimization techniques, such as hierarchical data organization, data reuse, and multiresolution representation. We experimentally evaluate our method with four clinical datasets consisting of up to 512 × 512 × 296 voxels. We find that exploiting on-chip memory achieves a 12-fold increase in speed over an off-chip memory version and thereby increases the efficiency of parallel execution from 4% to 46%. We also find that our method running on a GeForce GTX 580 card is approximately 14 times faster than a fully optimized CPU-based implementation running on four cores. Some multimodal registration results are also provided to illustrate the limitations of our method. We believe that our highly efficient method, which completes an alignment task within a few tens of seconds, will be useful for realizing rapid nonrigid registration.
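
The abstract names NMI as the similarity measure but does not state its formula. As a rough illustration only, the sketch below assumes the common entropy-ratio definition, NMI = (H(A) + H(B)) / H(A, B), computed from a joint intensity histogram. It is a serial, pure-Python reference and not the authors' CUDA implementation; the function names, the bin count, and the intensity range are hypothetical choices made for this example.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a histogram given as bin counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def normalized_mutual_information(ref, flo, bins=8, max_val=256):
    """Return (H(A) + H(B)) / H(A, B) for two equally sized intensity lists.

    Assumption: intensities lie in [0, max_val) and are quantized into
    `bins` equal-width bins before the joint histogram is built.
    """
    if len(ref) != len(flo) or not ref:
        raise ValueError("images must be non-empty and of equal size")

    width = max_val / bins
    joint = Counter()
    for a, b in zip(ref, flo):
        i = min(int(a / width), bins - 1)
        j = min(int(b / width), bins - 1)
        joint[(i, j)] += 1  # one joint-histogram update per voxel pair

    # Marginal histograms are row and column sums of the joint histogram.
    marg_a, marg_b = Counter(), Counter()
    for (i, j), c in joint.items():
        marg_a[i] += c
        marg_b[j] += c

    total = len(ref)
    h_ab = entropy(joint.values(), total)
    if h_ab == 0.0:
        raise ValueError("NMI is undefined when the joint entropy is zero")
    return (entropy(marg_a.values(), total) + entropy(marg_b.values(), total)) / h_ab


if __name__ == "__main__":
    image = [0, 64, 128, 192]
    print(normalized_mutual_information(image, image, bins=4))
```

Under this definition, perfectly aligned identical images give a value of 2, and statistically independent intensities give a value of 1. The per-voxel histogram update in the loop is the step a GPU version would distribute across threads, which is where the choice between on-chip and off-chip memory for the histogram bins would matter.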
