GPU accelerated simulations of 3D deterministic particle transport using discrete ordinates method

The graphics processing unit (GPU), originally developed for real-time, high-definition 3D graphics in computer games, now offers considerable capability for scientific computing. The basis of particle transport simulation is the time-dependent, multi-group, inhomogeneous Boltzmann transport equation. Its numerical solution involves the discrete ordinates (S_n) method and the procedure of source iteration. In this paper, we present a GPU-accelerated simulation of one-energy-group, time-independent, deterministic discrete ordinates particle transport in 3D Cartesian geometry (Sweep3D). The performance of the GPU simulations is reported for runs with vacuum boundary conditions. We also discuss the relative advantages and disadvantages of the GPU implementation, simulation on multiple GPUs, the programming effort, and code portability. The results show that, without flux fixup, the overall speedup of one NVIDIA Tesla M2050 GPU ranges from 2.56 relative to one Intel Xeon X5670 chip to 8.14 relative to one Intel Core Q6600 chip. With flux fixup, the simulation on one M2050 is 1.23 times faster than on one X5670.
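To make the interplay of discrete ordinates and source iteration concrete, the following is a minimal sketch, not the Sweep3D code or the GPU implementation described in the paper. It assumes a much simpler setting: a 1D homogeneous slab, one energy group, isotropic scattering, a uniform fixed source, vacuum boundaries, S4 Gauss-Legendre quadrature, and diamond differencing without flux fixup. All names and parameter values (`sweep`, `source_iteration`, `sigma_t`, `sigma_s`, `q`) are hypothetical and chosen only for illustration.

```python
# Illustrative 1D slab S_n solver (assumed setup, not the paper's 3D code).
# S4 Gauss-Legendre quadrature on [-1, 1]; the weights sum to 2.
MU = [-0.8611363115940526, -0.3399810435848563,
      0.3399810435848563, 0.8611363115940526]
W = [0.3478548451374538, 0.6521451548625461,
     0.6521451548625461, 0.3478548451374538]


def sweep(source, sigma_t, h):
    """One transport sweep over all ordinates; returns the scalar flux."""
    n = len(source)
    phi = [0.0] * n
    for mu, w in zip(MU, W):
        # Sweep in the direction of particle travel.
        cells = range(n) if mu > 0 else range(n - 1, -1, -1)
        a = 2.0 * abs(mu) / h
        psi_in = 0.0  # vacuum boundary: no incoming angular flux
        for i in cells:
            # Diamond difference: cell-centre flux from the incoming edge flux.
            psi_c = (source[i] + a * psi_in) / (sigma_t + a)
            psi_in = 2.0 * psi_c - psi_in  # outgoing edge feeds the next cell
            phi[i] += w * psi_c
    return phi


def source_iteration(n_cells=50, width=5.0, sigma_t=1.0, sigma_s=0.5,
                     q=1.0, tol=1e-8, max_iters=500):
    """Lag the scattering source and sweep until the scalar flux converges."""
    h = width / n_cells
    phi = [0.0] * n_cells
    for it in range(1, max_iters + 1):
        # Isotropic emission density per unit direction cosine.
        source = [0.5 * (sigma_s * p + q) for p in phi]
        new_phi = sweep(source, sigma_t, h)
        err = max(abs(x - y) for x, y in zip(new_phi, phi))
        phi = new_phi
        if err < tol:
            return phi, it
    return phi, max_iters


if __name__ == "__main__":
    phi, iters = source_iteration()
    print(f"converged in {iters} iterations, centre flux {phi[len(phi) // 2]:.4f}")
```

In this sketch each cell depends on its upstream neighbour, so a sweep along one ordinate is inherently sequential. In 3D the same dependency becomes a diagonal wavefront across the grid, which is the structure a parallel implementation has to exploit.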
