Orders-of-magnitude performance increases in GPU-accelerated correlation of images from the International Space Station

We implement image correlation, a fundamental component of many real-time imaging and tracking systems, on a graphics processing unit (GPU) using NVIDIA’s CUDA platform. We use our code to analyze images of liquid-gas phase separation in a model colloid-polymer system, photographed in the absence of gravity aboard the International Space Station (ISS). Our GPU code is 4,000 times faster than simple MATLAB code performing the same calculation on a central processing unit (CPU), 130 times faster than simple C code, and 30 times faster than optimized C++ code using single-instruction, multiple-data (SIMD) extensions. These speed increases from parallel algorithms let us rapidly analyze images downlinked from the ISS and send feedback to astronauts on orbit while the experiments are still running.
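
The abstract does not spell out which correlation is computed, so the following is only a minimal sketch of what a direct image correlation can look like. It assumes a mean-subtracted, variance-normalized spatial autocorrelation with periodic boundaries. The function name `autocorrelation` and the parameter `max_shift` are hypothetical, chosen for illustration, and are not taken from the authors' code.

```python
def autocorrelation(image, max_shift):
    """Brute-force spatial autocorrelation of a 2D image (list of rows).

    Assumptions (not from the paper): periodic boundaries, mean subtraction,
    and normalization so that the zero-shift value equals 1.
    Returns a dict mapping (dx, dy) to the correlation at that shift.
    """
    rows = len(image)
    cols = len(image[0])
    n = rows * cols

    # Subtract the mean intensity so a uniform background does not contribute.
    mean = sum(sum(row) for row in image) / n
    dev = [[pixel - mean for pixel in row] for row in image]

    variance = sum(d * d for row in dev for d in row) / n
    if variance == 0:
        raise ValueError("image is uniform; correlation is undefined")

    corr = {}
    for dy in range(max_shift + 1):
        for dx in range(max_shift + 1):
            # Each shift is an independent sum over every pixel pair.
            total = 0.0
            for y in range(rows):
                for x in range(cols):
                    total += dev[y][x] * dev[(y + dy) % rows][(x + dx) % cols]
            corr[(dx, dy)] = total / (n * variance)
    return corr


# Usage: a 4x4 checkerboard is perfectly anti-correlated at a one-pixel shift.
checkerboard = [[(x + y) % 2 for x in range(4)] for y in range(4)]
result = autocorrelation(checkerboard, max_shift=2)
```

Every shift in this sketch is computed independently of the others, and the cost grows with the number of pixels times the number of shifts. That kind of workload tends to suit parallel hardware, although how the authors actually divide the work on the GPU is not stated in the abstract.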
