Performance Modeling of Gyrokinetic Toroidal Simulations for a Many-Tasking Runtime System
Thomas L. Sterling | Matthew Anderson | Maciej Brodowicz | Abhishek Kulkarni
[1] Douglas Thain, et al. Qthreads: An API for programming with millions of lightweight threads, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[2] Arch D. Robison, et al. Structured Parallel Programming: Patterns for Efficient Computation, 2012.
[3] Gilbert Hendry, et al. SST: A Simulator for Exascale Co-design, 2012.
[4] M. Brodowicz, et al. Application Characteristics of Many-tasking Execution Models, 2013.
[5] Victor Luchangco, et al. The Fortress Language Specification Version 1.0, 2007.
[6] Guang R. Gao, et al. ParalleX: A Study of A New Parallel Computation Model, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[7] Henry G. Baker, et al. Actors and Continuous Functionals, 1978, Formal Description of Programming Concepts.
[8] Samuel Williams, et al. Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[9] John Shalf, et al. NERSC-6 Workload Analysis and Benchmark Selection Process, 2008.
[10] A. Lumsdaine, et al. LogGOPSim: simulating large-scale applications in the LogGOPS model, 2010, HPDC '10.
[11] Robert H. Halstead, et al. MULTILISP: a language for concurrent symbolic computation, 1985, TOPL.
[12] Bradford L. Chamberlain, et al. Parallel Programmability and the Chapel Language, 2007, Int. J. High Perform. Comput. Appl.
[13] B. J. Muga, et al. Particle-in-Cell Method, 1970.
[14] Alice Koniges, et al. Application Acceleration on Current and Future Cray Platforms, 2010.
[15] Martin Schulz, et al. Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[16] Stephen L. Olivier, et al. Comparison of OpenMP 3.0 and Other Task Parallel Frameworks on Unbalanced Task Graphs, 2010, International Journal of Parallel Programming.
[17] Arch D. Robison, et al. Chapter 3 – Patterns, 2012.
[18] James Reinders, et al. Intel threading building blocks - outfitting C++ for multi-core processor parallelism, 2007.
[19] S. Ethier, et al. Gyrokinetic particle-in-cell simulations of plasma microturbulence on advanced computing platforms, 2005.
[20] Roger W. Hockney, et al. The Communication Challenge for MPP: Intel Paragon and Meiko CS-2, 1994, Parallel Computing.
[21] Jack J. Dongarra, et al. Performance analysis of MPI collective operations, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[22] Xingfu Wu, et al. Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems, 2011, 2011 14th IEEE International Conference on Computational Science and Engineering.
[23] Tarek El-Ghazawi, et al. Evaluation of UPC on the Cray X1, 2005.
[24] Torsten Hoefler, et al. Performance modeling for systematic performance tuning, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[25] Mark M. Mathis, et al. A performance model of non-deterministic particle transport on large-scale systems, 2003, Future Gener. Comput. Syst.
[26] Vivek Sarkar, et al. X10: an object-oriented approach to non-uniform cluster computing, 2005, OOPSLA '05.
[27] Michael Haupt, et al. A comparison of context-oriented programming languages, 2009, COP@ECOOP.
[28] John M. Mellor-Crummey, et al. Managing Asynchronous Operations in Coarray Fortran 2.0, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[29] Rajeev Thakur, et al. Hybrid parallel programming with MPI and unified parallel C, 2010, Conf. Computing Frontiers.
[30] Thomas L. Sterling, et al. Improving the scalability of parallel N-body applications with an event-driven constraint-based execution model, 2012, Int. J. High Perform. Comput. Appl.
[31] Gilbert Hendry. Decreasing Network Power with on-off Links Informed by Scientific Applications, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[32] Thomas L. Sterling, et al. ParalleX An Advanced Parallel Execution Model for Scaling-Impaired Applications, 2009, 2009 International Conference on Parallel Processing Workshops.
[33] Steven A. Hofmeyr, et al. Oversubscription on multicore processors, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[34] Bruno Raffin, et al. XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[35] Franck Cappello, et al. MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[36] Alejandro Duran, et al. Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP, 2009, 2009 International Conference on Parallel Processing.