Balancing Load of GPU Subsystems to Accelerate Image Reconstruction in Parallel Beam Tomography

Synchrotron X-ray imaging is a powerful method to investigate internal structures down to the micro- and nanoscopic scale. Fast cameras recording thousands of frames per second allow time-resolved studies with a high temporal resolution. Fast image reconstruction is essential to provide the synchrotron instrumentation with the imaging information required to track and control the process under study. Traditionally, the Filtered Back Projection algorithm is used for tomographic reconstruction. In this article, we discuss how to implement the algorithm efficiently on modern GPGPU architectures. The key is to achieve balanced utilization of the available GPU subsystems. We present two highly optimized algorithms to perform back projection on parallel hardware. One relies on the texture engine to perform the reconstruction, while the other utilizes the core computational units of the GPU. Both methods significantly outperform the current state-of-the-art techniques found in standard reconstruction codes. Finally, we propose a hybrid approach combining both algorithms to better balance the load between GPU subsystems. It further boosts the performance by about 30% on the NVIDIA Pascal micro-architecture.
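For orientation, the back projection step at the heart of the algorithm can be written as a short, unoptimized reference in plain Python. This is a minimal sketch under stated assumptions, not one of the optimized GPU kernels discussed in the article: the function name `back_project`, the choice of the detector centre as the rotation axis, and the pi / N scaling are illustrative, and the filtering step of Filtered Back Projection is omitted. The linear interpolation between neighbouring detector bins is written out explicitly here; it is the kind of operation a GPU texture unit can perform in hardware.

```python
import math


def back_project(sinogram, angles, size):
    """Naive parallel-beam back projection with linear interpolation.

    sinogram: list of projections, one per angle, each a list of detector bins
              (assumed to be already filtered; at least two bins per row)
    angles:   projection angles in radians, one per sinogram row
    size:     width and height of the reconstructed square slice in pixels
    """
    n_bins = len(sinogram[0])
    center = (n_bins - 1) / 2.0   # assumption: rotation axis at detector centre
    half = (size - 1) / 2.0       # slice centre in pixel coordinates
    trig = [(math.cos(a), math.sin(a)) for a in angles]
    scale = math.pi / len(angles)

    image = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            acc = 0.0
            for proj, (c, s) in zip(sinogram, trig):
                # Detector coordinate hit by the ray passing through (x, y).
                t = (x - half) * c + (y - half) * s + center
                if 0.0 <= t <= n_bins - 1:
                    i = min(int(t), n_bins - 2)
                    w = t - i
                    # Linear interpolation between the two nearest bins.
                    acc += (1.0 - w) * proj[i] + w * proj[i + 1]
            image[y][x] = acc * scale
    return image
```

Every pixel reads one interpolated value from every projection, so the cost grows with the number of pixels times the number of projections. The memory fetches and the interpolation dominate this loop, which is why the way they are distributed across GPU subsystems matters for performance.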
