Portably Improving Uintah's Readiness for Exascale Systems Through the Use of Kokkos
Daniel Sunderland | John K. Holmen | Alan Humphrey | Jeremy N. Thornock | Brad Peterson | Oscar H. Díaz-Ibarra
[1] Martin Berzins,et al. A Scalable Algorithm for Radiative Heat Transfer Using Reverse Monte Carlo Ray Tracing, 2015, ISC.
[2] Hartmut Kaiser,et al. HPX: A Task Based Programming Model in a Global Address Space , 2014, PGAS.
[3] Sebastiano Vigna,et al. An Experimental Exploration of Marsaglia's xorshift Generators, Scrambled, 2014, ACM Trans. Math. Softw.
[4] Qingyu Meng,et al. Investigating applications portability with the Uintah DAG-based runtime system on petascale supercomputers, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[5] Justin Luitjens,et al. Dynamic task scheduling for the Uintah framework , 2010, 2010 3rd Workshop on Many-Task Computing on Grids and Supercomputers.
[6] M. Zingale,et al. Meeting the Challenges of Modeling Astrophysical Thermonuclear Explosions: Castro, Maestro, and the AMReX Astrophysics Suite , 2017, 1711.06203.
[7] Justin Luitjens,et al. Uintah: a scalable framework for hazard analysis , 2010, TG.
[8] Jeremy N. Thornock,et al. Application of LES-CFD for predicting pulverized-coal working conditions after installation of NOx control system , 2018, Energy.
[9] Martin Berzins,et al. Improving Uintah's Scalability Through the Use of Portable Kokkos-Based Data Parallel Tasks , 2017, PEARC.
[10] Daniel Sunderland,et al. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns , 2014, J. Parallel Distributed Comput..
[11] Martin Berzins,et al. Demonstrating GPU code portability and scalability for radiative heat transfer computations , 2018, J. Comput. Sci..
[12] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[13] Martin Berzins,et al. Chapter 13 – Exploring Use of the Reserved Core, 2015.
[14] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[15] Richard D. Hornung,et al. The RAJA Portability Layer: Overview and Status, 2014.
[16] Martin Berzins,et al. An Overview of Performance Portability in the Uintah Runtime System through the Use of Kokkos, 2016, 2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2).
[17] John Shalf,et al. The Cactus Framework and Toolkit: Design and Applications , 2002, VECPAR.
[18] Christon,et al. Spatial domain-based parallelism in large scale, participating-media, radiative transport applications, 1996.
[19] Qingyu Meng,et al. Using hybrid parallelism to improve memory use in the Uintah framework, 2011.
[20] Matt Martineau,et al. An Evaluation of Emerging Many-Core Parallel Programming Models , 2016, PMAM@PPoPP.
[21] Steve Plimpton,et al. Fast parallel algorithms for short-range molecular dynamics, 1993.
[22] Alexander Aiken,et al. Legion: Expressing locality and independence with logical regions , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[23] Martin Berzins,et al. A Preliminary Port and Evaluation of the Uintah AMT Runtime on Sunway TaihuLight , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[24] Thomas Hérault,et al. PaRSEC: Exploiting Heterogeneity to Enhance Scalability , 2013, Computing in Science & Engineering.
[25] Jennifer Spinti,et al. Large eddy simulations of accidental fires using massively parallel computers, 2003.
[26] Jeremy N. Thornock,et al. Large eddy simulation of polydisperse particles in turbulent coaxial jets using the direct quadrature method of moments, 2014.
[27] Qingyu Meng,et al. Radiation modeling using the Uintah heterogeneous CPU/GPU runtime system , 2012, XSEDE '12.
[28] Tamara G. Kolda,et al. An overview of the Trilinos project , 2005, TOMS.
[29] Martin Berzins,et al. Radiative Heat Transfer Calculation on 16384 GPUs Using a Reverse Monte Carlo Ray Tracing Approach with Adaptive Mesh Refinement , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).