High performance computing for deformable image registration: Towards a new paradigm in adaptive radiotherapy.

Readily available temporal imaging, or time-series volumetric (4D) imaging, has become an indispensable component of treatment planning and adaptive radiotherapy (ART) at many radiotherapy centers. Deformable image registration (DIR) is also used in other areas of medical imaging, including motion-corrected image reconstruction. Because of long computation times, clinical applications of DIR in radiation therapy and elsewhere have been limited and consequently relegated to offline analysis. With recent advances in hardware and software, graphics processing unit (GPU) based computing is an emerging technology for general purpose computation, including DIR, and is well suited to highly parallelized computing. However, traditional general purpose computation on the GPU is limited because of the constraints of the available programming platforms. In addition, the GPU currently has less dedicated processor memory than the CPU, which can limit the useful working data set for parallelized processing. We present an implementation of the demons algorithm using the NVIDIA 8800 GTX GPU and the new CUDA programming environment. CUDA provides a C-like programming interface and allows direct access to the highly parallel compute units in the GPU. The GPU performance is compared with single-threaded and multithreaded CPU implementations, written in C, on an Intel dual core 2.4 GHz CPU. Comparisons were carried out for volumetric clinical lung images acquired using 4DCT. Computation times for 100 iterations in the range of 1.8-13.5 s were observed for the GPU, with image sizes ranging from 2.0×10^6 to 14.2×10^6 pixels. The GPU registration was 55-61 times faster than the single-threaded CPU implementation and 34-39 times faster than the multithreaded implementation. For CPU based computing, the computation time generally has a linear dependence on image size for medical imaging data.
Computational efficiency is characterized in terms of time per megapixel per iteration (TPMI), with units of seconds per megapixel per iteration (spmi). For the demons algorithm, our CPU implementation yielded largely invariant values of TPMI. The mean TPMIs were 0.527 spmi and 0.335 spmi for the single-threaded and multithreaded cases, respectively, with less than 2% variation over the considered range of image sizes. For GPU computing, we achieved a TPMI of 0.00916 spmi with 3.7% variation, indicating optimized memory handling under CUDA. The paradigm of GPU based real-time DIR opens up a host of clinical applications for medical imaging.
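
As a rough consistency check on the figures above, the TPMI definition can be applied to the end points of the reported GPU range. This is a minimal sketch, not the authors' code: the helper name tpmi is hypothetical, and pairing the shortest time with the smallest image (and the longest with the largest) is an assumption, since the abstract reports only the two ranges.

```python
def tpmi(seconds, pixels, iterations):
    """Time per megapixel per iteration, in spmi (hypothetical helper)."""
    return seconds / (pixels / 1e6) / iterations


# End points of the GPU range quoted in the abstract, 100 iterations each.
# Assumption: 1.8 s belongs to the 2.0e6-pixel image, 13.5 s to the 14.2e6-pixel one.
small = tpmi(1.8, 2.0e6, 100)
large = tpmi(13.5, 14.2e6, 100)

# Speedups implied by the reported mean TPMIs (CPU value divided by GPU value).
speedup_single = 0.527 / 0.00916
speedup_multi = 0.335 / 0.00916

print(f"GPU TPMI at range end points: {small:.5f} and {large:.5f} spmi")
print(f"Implied speedup: {speedup_single:.1f}x single-threaded, {speedup_multi:.1f}x multithreaded")
```

Under that pairing, the end points give roughly 0.0090 and 0.0095 spmi, which bracket the reported mean of 0.00916 spmi. The ratios of the mean TPMIs, about 57.5 and 36.6, fall inside the reported speedup ranges of 55-61 and 34-39.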
