The Dynamic Kernel Scheduler - Part 1

Abstract: Emerging processor architectures such as GPUs and Intel MICs offer huge performance potential for high performance computing. However, developing software that uses these hardware accelerators introduces additional challenges for the developer. These challenges include exposing increased parallelism, handling different hardware designs, and using multiple development frameworks in order to utilise devices from different vendors. The Dynamic Kernel Scheduler (DKS) is being developed to provide a software layer between the host application and different hardware accelerators. DKS handles the communication between the host and the device, schedules task execution, and provides a library of built-in algorithms. Algorithms available in the DKS library will be written in CUDA, OpenCL, and OpenMP. Depending on the available hardware, DKS can select the appropriate implementation of the algorithm. The first DKS version was created using CUDA for Nvidia GPUs and OpenMP for Intel MIC. DKS was further integrated into OPAL (Object-oriented Parallel Accelerator Library) in order to speed up a parallel FFT-based Poisson solver and Monte Carlo simulations of particle–matter interaction used for proton therapy degrader modelling. DKS was also used together with Minuit2 for parameter fitting, where the χ² and max-log-likelihood functions were offloaded to the hardware accelerator. The concepts of DKS, first results, and plans for the future are presented in this paper.
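The idea of selecting an implementation according to the available hardware can be illustrated with a minimal sketch. This is not the DKS API: the class `KernelRegistry`, the backend names, the preference order, and the `sum_sq_residuals` stand-in are all hypothetical and chosen only for illustration. The "kernels" here are plain Python functions, whereas a real layer would launch CUDA, OpenCL, or OpenMP code.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical backend names and preference order (an assumption, not from DKS).
BACKEND_PREFERENCE: List[str] = ["cuda", "opencl", "openmp"]


class KernelRegistry:
    """Maps an algorithm name to one implementation per backend."""

    def __init__(self) -> None:
        self._impls: Dict[str, Dict[str, Callable]] = {}

    def register(self, algorithm: str, backend: str, impl: Callable) -> None:
        """Store the implementation of `algorithm` for the given backend."""
        self._impls.setdefault(algorithm, {})[backend] = impl

    def select(self, algorithm: str, available: Sequence[str]) -> Callable:
        """Return the most preferred implementation whose backend is available."""
        impls = self._impls.get(algorithm, {})
        for backend in BACKEND_PREFERENCE:
            if backend in available and backend in impls:
                return impls[backend]
        raise LookupError(
            f"no implementation of {algorithm!r} for backends {list(available)}"
        )


def sum_sq_residuals(data, model):
    """Unweighted sum of squared residuals, a stand-in for an offloaded chi-square."""
    return sum((d - m) ** 2 for d, m in zip(data, model))


registry = KernelRegistry()
# Each lambda tags its result with the backend name so the choice is visible.
registry.register("chi_square", "cuda", lambda d, m: ("cuda", sum_sq_residuals(d, m)))
registry.register("chi_square", "openmp", lambda d, m: ("openmp", sum_sq_residuals(d, m)))

# On a host where only OpenMP is available, the OpenMP variant is chosen.
kernel = registry.select("chi_square", available=["openmp"])
print(kernel([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))
```

Keeping the lookup separate from the kernels means the host application asks for an algorithm by name and never refers to a specific framework, which is the role the abstract assigns to the software layer.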
