Accelerating cone beam reconstruction using the CUDA-enabled GPU

Compute unified device architecture (CUDA) is a software development platform that enables us to write and run general-purpose applications on the graphics processing unit (GPU). This paper presents a fast method for cone beam reconstruction using the CUDA-enabled GPU. The proposed method is accelerated by two techniques: (1) off-chip memory access reduction; and (2) memory latency hiding. We describe how these techniques can be incorporated into CUDA code. Experimental results show that the proposed method runs at 82% of the peak memory bandwidth, taking 5.6 seconds to reconstruct a 512^3-voxel volume from 360 512^2-pixel projections. This performance is 18% faster than the prior method. We also present detailed analyses of how effectively the acceleration techniques increase the reconstruction performance of a naive method.
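
To put the reported figures in perspective, the following minimal sketch converts them into a rough voxel-update throughput. It assumes that every voxel is updated exactly once per projection, which the abstract does not state explicitly, and the variable names are illustrative only.

```python
# Figures quoted in the abstract.
volume_side = 512        # voxels per edge of the reconstructed volume
num_projections = 360    # number of 512x512-pixel projections
elapsed_seconds = 5.6    # reported reconstruction time

# Assumption (not stated in the abstract): each voxel is updated
# exactly once per projection during backprojection.
voxel_updates = volume_side ** 3 * num_projections
updates_per_second = voxel_updates / elapsed_seconds

print(f"voxel updates: {voxel_updates:,}")                         # 48,318,382,080
print(f"throughput: {updates_per_second / 1e9:.2f} G updates/s")   # 8.63
```

Under this assumption the reconstruction performs roughly 48 billion voxel updates, which suggests why the amount of off-chip memory traffic per update could dominate the running time.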