High performance radiation transport simulations: Preparing for TITAN

This paper describes the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, which is of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics, and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method that scales to hundreds of thousands of processing cores, and a modern code architecture that supports straightforward integration of new features. We discuss the performance of Denovo on Titan, the 20+ petaflop GPU-based system at ORNL. We describe the algorithms and techniques used to exploit Titan's heterogeneous compute node architecture, and the challenges of obtaining good parallel performance for a sparse hyperbolic PDE solver that contains inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.
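To illustrate why a transport sweep is inherently sequential yet still admits wavefront parallelism, the following is a minimal sketch and not Denovo's implementation. It assumes a 2D uniform grid, a single direction with positive cosines, vacuum inflow boundaries, and simple step (upwind) differencing; KBA itself operates on 3D grids with a block decomposition and pipelining. All names here (`update_cell`, `wavefronts`, `sweep_sequential`, `sweep_wavefront`) are hypothetical.

```python
def update_cell(psi, source, i, j, mu, eta, sigma, h):
    """Step-differenced update: the cell needs its upstream neighbors first."""
    left = psi[i - 1][j] if i > 0 else 0.0   # vacuum inflow assumed at i = 0
    below = psi[i][j - 1] if j > 0 else 0.0  # vacuum inflow assumed at j = 0
    numerator = source[i][j] + (mu * left + eta * below) / h
    return numerator / (sigma + (mu + eta) / h)


def wavefronts(nx, ny):
    """Yield lists of cells on each diagonal i + j = d, in sweep order."""
    for d in range(nx + ny - 1):
        yield [(i, d - i) for i in range(max(0, d - ny + 1), min(nx, d + 1))]


def sweep_sequential(source, mu, eta, sigma, h):
    """Visit cells in plain row-by-row order."""
    nx, ny = len(source), len(source[0])
    psi = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            psi[i][j] = update_cell(psi, source, i, j, mu, eta, sigma, h)
    return psi


def sweep_wavefront(source, mu, eta, sigma, h):
    """Visit cells diagonal by diagonal.

    Cells on one diagonal depend only on the previous diagonal, so the
    inner loop could run in parallel; the outer loop must stay ordered.
    """
    nx, ny = len(source), len(source[0])
    psi = [[0.0] * ny for _ in range(nx)]
    for front in wavefronts(nx, ny):
        for i, j in front:
            psi[i][j] = update_cell(psi, source, i, j, mu, eta, sigma, h)
    return psi


if __name__ == "__main__":
    q = [[1.0] * 4 for _ in range(3)]
    a = sweep_sequential(q, mu=0.5, eta=0.5, sigma=1.0, h=0.1)
    b = sweep_wavefront(q, mu=0.5, eta=0.5, sigma=1.0, h=0.1)
    print(a == b)
```

In this sketch the number of sequential stages is `nx + ny - 1`, while the available parallelism within a stage is bounded by the diagonal length. That limit is one way to see why good parallel performance is hard to obtain for sweeps.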
