TH‐D‐L100J‐06: Streaming Architectures for Cone‐Beam CT Image Reconstruction and Deformable Registration

Purpose: To develop data‐parallel algorithms for tomographic reconstruction and deformable registration within a stream‐processing paradigm, and to execute them on low‐cost stream processors such as graphics processing units (GPUs). Method and Materials: In computer science, the stream‐processing paradigm is emerging as a cost‐effective way of solving large‐scale parallel computing problems. This is due to the recent introduction of high‐performance stream‐processing hardware such as the Cell processor and GPUs. Both are commodity stream (or vector) processors designed specifically to support large‐scale parallel computing on a single chip. This presentation describes how to use the stream‐processing model to accelerate the computationally demanding problems of data reconstruction and fusion for radiotherapy. Highly data‐parallel models were developed for: (1) the Feldkamp, Davis, and Kress (FDK) reconstruction algorithm, and (2) the Demons algorithm for optical‐flow based deformable registration. These models were implemented within the Brook programming environment and executed on an NVIDIA 8800 GPU. Results: The performance of the GPU‐based implementations of the FDK and Demons algorithms was analyzed using data obtained from an IGRT testbed of a preserved swine lung. The results indicate a speedup of up to 17 times for FDK, and up to 33 times for Demons, when compared with a 2.4 GHz Intel Duo‐Core processor. In addition, the GPU was found to be capable of high‐quality reconstructions, with differences within a few Hounsfield units. Conclusions: Results indicate that data‐parallel image‐processing algorithms, when properly designed and executed on a GPU, achieve significant speedup when compared to high‐end desktop CPUs. The acceleration of key data‐processing stages will decrease the time needed to perform image‐guided patient positioning and analysis.
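
To illustrate why the Demons registration named above suits a stream processor, the following is a minimal sketch in plain Python of a per-pixel "kernel": each output element is computed from read-only inputs, with no dependence on any other output, so every pixel could in principle be evaluated in parallel. The update rule shown is one commonly cited Thirion-style form of the demons force, which is an assumption here since the abstract does not state which variant was used. The function names `demons_kernel` and `demons_update`, the 2D list-of-lists image layout, and the `eps` guard are all hypothetical and are not taken from the Brook implementation.

```python
# Hypothetical sketch: names, 2D layout, and the exact update rule are
# assumptions for illustration, not the implementation described above.

def demons_kernel(fixed, moving, x, y, eps=1e-9):
    """Displacement update (ux, uy) for one pixel.

    Reads from `fixed` and `moving` only and writes nothing shared,
    which is the property a stream kernel needs.
    """
    h, w = len(fixed), len(fixed[0])
    # Finite-difference gradient of the fixed image; neighbour indices are
    # clamped at the borders, so edges fall back to one-sided differences.
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    gx = (fixed[y][x1] - fixed[y][x0]) / max(x1 - x0, 1)
    gy = (fixed[y1][x] - fixed[y0][x]) / max(y1 - y0, 1)
    # Intensity mismatch between the two images at this pixel.
    diff = moving[y][x] - fixed[y][x]
    denom = gx * gx + gy * gy + diff * diff
    if denom < eps:
        # Flat region with matching intensities: no force.
        return (0.0, 0.0)
    return (diff * gx / denom, diff * gy / denom)


def demons_update(fixed, moving):
    """Apply the kernel to every pixel; iterations are independent."""
    h, w = len(fixed), len(fixed[0])
    return [[demons_kernel(fixed, moving, x, y) for x in range(w)]
            for y in range(h)]
```

On a GPU, the two nested loops in `demons_update` would be replaced by launching the kernel once per output element. A full registration would also smooth the displacement field and iterate, steps that this sketch omits.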