Scalable FBP decomposition for cone-beam CT reconstruction

Filtered Back-Projection (FBP) is a fundamental, compute-intensive algorithm used in tomographic image reconstruction. Cone-Beam Computed Tomography (CBCT) devices use a cone-shaped X-ray beam, in contrast to the parallel beam used in older CT generations. Distributed image reconstruction of cone-beam datasets typically relies on dividing batches of images across different nodes. This simple input decomposition, however, limits input/output sizes and scalability. We propose a novel decomposition scheme and reconstruction algorithm for distributed FBP. This scheme enables arbitrarily large input/output sizes, eliminates the redundancy arising in the end-to-end pipeline, and improves scalability by replacing two communication collectives with a single segmented reduction. Finally, we implement the proposed decomposition scheme in a framework that is usable with all current-generation (7th generation) CT devices. In our experiments using up to 1024 GPUs, our framework can reconstruct 4096³ volumes, for real-world datasets, in under 16 seconds (including I/O).
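To make the idea of a segmented reduction concrete, the following is a minimal single-process sketch in plain Python. It is not the authors' implementation: the function name `segmented_reduce`, the four simulated ranks, and the mapping of ranks to output slabs are all assumptions chosen for illustration. Each simulated rank holds a partial back-projection for one slab of the output volume, and only ranks that share a slab are summed together, instead of reducing every rank's data across the whole machine.

```python
def segmented_reduce(partials, segment_ids):
    """Sum the partial results that belong to the same segment.

    partials:    one list of floats per simulated rank (hypothetical data).
    segment_ids: segment_ids[r] is the output slab that rank r contributes to.
    Returns a dict mapping each segment id to the element-wise sum of its partials.
    """
    sums = {}
    for part, seg in zip(partials, segment_ids):
        if seg not in sums:
            # First contribution to this slab: start from a copy.
            sums[seg] = list(part)
        else:
            # Later contributions are accumulated element-wise.
            sums[seg] = [a + b for a, b in zip(sums[seg], part)]
    return sums


# Four simulated ranks; ranks 0-1 contribute to slab 0, ranks 2-3 to slab 1.
partials = [
    [1.0, 2.0, 3.0],  # rank 0, slab 0
    [0.5, 0.5, 0.5],  # rank 1, slab 0
    [4.0, 0.0, 1.0],  # rank 2, slab 1
    [1.0, 1.0, 1.0],  # rank 3, slab 1
]
segment_ids = [0, 0, 1, 1]

slabs = segmented_reduce(partials, segment_ids)

# The full output is the concatenation of the independently reduced slabs.
volume = slabs[0] + slabs[1]
print(volume)
```

In a real distributed setting the per-segment sums would be carried out by a communication library over groups of processes; the sketch only shows which data is combined with which.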
