Accelerating Advanced MRI Reconstructions on GPUs

Computational acceleration on graphics processing units (GPUs) can make advanced magnetic resonance imaging (MRI) reconstruction algorithms attractive in clinical settings, thereby improving the quality of MR images across a broad spectrum of applications. At present, MR imaging is often limited by high noise levels, significant imaging artifacts, and long data acquisition (scan) times. Advanced image reconstruction algorithms can mitigate these limitations and improve image quality by operating on scan data acquired with arbitrary trajectories while also incorporating additional information such as anatomical constraints. However, the improvements in image quality come at the expense of a considerable increase in computation. This paper describes the acceleration of an advanced reconstruction algorithm on NVIDIA's Quadro FX 5600. Optimizations such as register-allocating the voxel data, tiling the scan data, and storing the scan data in the Quadro's constant memory dramatically reduce the bandwidth to off-chip memory that the reconstruction requires. The Quadro's special functional units substantially accelerate the trigonometric computations in the algorithm's inner loops, and experimentally tuned code transformations increase the reconstruction's performance by an additional 20%. The reconstruction of a 3D image with 128^3 voxels ultimately achieves 150 GFLOPS and requires less than two minutes on the Quadro, while reconstruction on a quad-core CPU is thirteen times slower. Furthermore, relative to the true image, the error exhibited by the advanced reconstruction is only 12%, while conventional reconstruction techniques incur an error of 42%. In short, the acceleration afforded by the GPU greatly increases the appeal of the advanced reconstruction for clinical MRI applications.
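The loop structure behind the tiling and register-allocation optimizations can be illustrated with a small CPU-side sketch. It is not the paper's kernel: the function names, the sample layout `(kx, ky, kz, d)`, the sign of the phase, and the tile size are all assumptions made for illustration, and plain Python stands in for the GPU. The sketch assumes the inner loop accumulates, for each voxel, a sum of scan samples weighted by a complex exponential, which is where the sine and cosine evaluations arise. On a GPU, each tile of scan data would sit in constant memory and each voxel's accumulator in a register.

```python
import math


def accumulate_direct(voxels, samples):
    """Reference version: one pass over all scan samples per voxel.

    voxels  : list of (x, y, z) coordinates (hypothetical layout)
    samples : list of (kx, ky, kz, d), where d is a complex scan value
    """
    out = []
    for (x, y, z) in voxels:
        acc = 0j
        for (kx, ky, kz, d) in samples:
            # Assumed phase convention: exp(+2*pi*i * k.x)
            phase = 2.0 * math.pi * (kx * x + ky * y + kz * z)
            acc += d * complex(math.cos(phase), math.sin(phase))
        out.append(acc)
    return out


def accumulate_tiled(voxels, samples, tile_size=4):
    """Tiled version: process the scan data one small tile at a time.

    Each outer iteration plays the role of one kernel launch. The tile
    stands in for the chunk of scan data held in constant memory, and
    the local variables x, y, z, acc stand in for per-thread registers.
    """
    out = [0j] * len(voxels)
    for start in range(0, len(samples), tile_size):
        tile = samples[start:start + tile_size]
        for v, (x, y, z) in enumerate(voxels):
            acc = out[v]  # load the running sum once per tile
            for (kx, ky, kz, d) in tile:
                phase = 2.0 * math.pi * (kx * x + ky * y + kz * z)
                acc += d * complex(math.cos(phase), math.sin(phase))
            out[v] = acc  # store the running sum once per tile
    return out
```

Both functions perform the same additions in the same order, so they should agree. The tiled form only changes how often the voxel data and the running sum are read and written: once per tile instead of once per sample, which is the effect that keeping them in registers aims for on the GPU.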
