ASC ATDM Level 2 Milestone #5325: Asynchronous Many-Task Runtime System Analysis and Assessment for Next Generation Platforms
Martin Berzins | Stephen L. Olivier | Simon David Hammond | Paul Lin | Todd Gamblin | Michael Bauer | Peer-Timo Bremer | Matthew Tyler Bettencourt | Ryan E. Grant | Janine Camille Bennett | Robert L. Clay | Wonchan Lee | Alex Aiken | Nicole Lemaster Slattengren | Dan Sunderland | Keita Teranishi | Todd Harman | Gregory D. Sjaardema | Hemanth Kolla | Marc Gamell | John A. Schmidt | Nikhil Jain | Steven W. Bova | Jeremiah J. Wilke | Samuel Keith Gutierrez | Elliott Slaughter | Sean J. Treichler | Eric Mikida | Pat McCormick | Samuel Knight | Gavin Matthew Baker | David S. Hollman | Ken Franko | Laxmikant Kale | Alan Humphreys | Martin Schulz
[1] Carl Hewitt, et al. A Universal Modular ACTOR Formalism for Artificial Intelligence, 1973, IJCAI.
[2] Leslie G. Valiant, et al. A bridging model for parallel computation, 1990, CACM.
[3] Laxmikant V. Kalé, et al. The Chare Kernel Parallel Programming Language and System, 1990, ICPP.
[4] Laxmikant V. Kalé, et al. CHARM++: a portable concurrent object oriented system based on C++, 1993, OOPSLA '93.
[5] Message Passing Interface Forum. MPI: A message-passing interface standard, 1994.
[6] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.
[7] L. Dagum, et al. OpenMP: an industry standard API for shared-memory programming, 1998.
[8] Laxmikant V. Kalé, et al. Multiparadigm, Multilingual Interoperability: Experience with Converse, 1998, IPPS/SPDP Workshops.
[9] Edward A. Luke, et al. Loci: A Deductive Framework for Graph-Based Algorithms, 1999, ISCOPE.
[10] William Gropp, et al. Toward Scalable Performance Visualization with Jumpshot, 1999, Int. J. High Perform. Comput. Appl.
[11] George Ho, et al. PAPI: A Portable Interface to Hardware Performance Counters, 1999.
[12] Steven G. Parker, et al. Uintah: a massively parallel problem solving environment, 2000, Proceedings the Ninth International Symposium on High-Performance Distributed Computing.
[13] Nancy M. Amato, et al. STAPL: An Adaptive, Generic Parallel C++ Library, 2001, LCPC.
[14] Dan Bonachea. GASNet Specification, v1.1, 2002.
[15] B.P. Miller, et al. MRNet: A Software-Based Multicast/Reduction Network for Scalable Tools, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[16] Peter Van Roy, et al. Concepts, Techniques, and Models of Computer Programming, 2004.
[17] Sameer Kumar, et al. Scalable fine-grained parallelization of plane-wave-based ab initio molecular dynamics for large supercomputers, 2004, J. Comput. Chem.
[18] Laxmikant V. Kalé, et al. Debugging support for Charm++, 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[19] Allen D. Malony, et al. Performance Analysis Integration in the Uintah Software Development Cycle, 2003, International Journal of Parallel Programming.
[20] Laxmikant V. Kalé, et al. Scalable molecular dynamics with NAMD, 2005, J. Comput. Chem.
[21] Laxmikant V. Kalé, et al. Scaling applications to massively parallel machines using Projections performance analysis tool, 2006, Future Gener. Comput. Syst.
[22] William J. Dally, et al. Sequoia: Programming the Memory Hierarchy, 2006, International Conference on Software Composition.
[23] Allen D. Malony, et al. The Tau Parallel Performance System, 2006, Int. J. High Perform. Comput. Appl.
[24] Amitabh Sinha, et al. Projections: A Preliminary Performance Tool for Charm, 2007.
[25] L. Kalé, et al. Towards Petascale Cosmological Simulations with ChaNGa, 2007.
[26] Scott Klasky, et al. Terascale direct numerical simulations of turbulent combustion using S3D, 2008.
[27] Laxmikant V. Kalé, et al. Massively parallel cosmological simulations with ChaNGa, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[28] Kevin Skadron, et al. Scalable parallel programming, 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[29] Laxmikant V. Kalé, et al. Continuous performance monitoring for large-scale parallel applications, 2009, 2009 International Conference on High Performance Computing (HiPC).
[30] Laxmikant V. Kalé, et al. Integrated Performance Views in Charm++: Projections Meets TAU, 2009, 2009 International Conference on Parallel Processing.
[31] William Gropp. MPI at Exascale: Challenges for Data Structures and Algorithms, 2009, PVM/MPI.
[32] John E. Stone, et al. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems, 2010, Computing in Science & Engineering.
[33] Laxmikant V. Kalé, et al. Debugging Large Scale Applications in a Virtualized Environment, 2010, LCPC.
[34] Nathan R. Tallent, et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs, 2010, Concurr. Comput. Pract. Exp.
[35] Justin Luitjens, et al. Improving the performance of Uintah: A large-scale adaptive meshing computational framework, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[36] Kesheng Wu, et al. Scientific Discovery at the Exascale, 2011.
[37] S. Dosanjh, et al. Architectures and Technology for Extreme Scale Computing Report from the Workshop Node Architecture and Power Reduction Strategies, 2011.
[38] Alexander Aiken, et al. Legion: Expressing locality and independence with logical regions, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[39] Qingyu Meng, et al. Radiation modeling using the Uintah heterogeneous CPU/GPU runtime system, 2012, XSEDE '12.
[40] Qingyu Meng, et al. The uintah framework: a unified heterogeneous task scheduling and runtime system, 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[41] Thomas Heller, et al. Application of the ParalleX execution model to stencil-based problems, 2012, Computer Science - Research and Development.
[42] Mark Anders, et al. Near-threshold voltage (NTV) design — Opportunities and challenges, 2012, DAC Design Automation Conference 2012.
[43] Laxmikant V. Kalé, et al. A distributed dynamic load balancer for iterative applications, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[44] Alexander Aiken, et al. Language support for dynamic, hierarchical data partitioning, 2013, OOPSLA.
[45] Robert Dietrich, et al. OMPT: An OpenMP Tools Application Programming Interface for Performance Analysis, 2013, IWOMP.
[46] James H. Laros, et al. PowerInsight - A commodity power measurement capability, 2013, 2013 International Green Computing Conference Proceedings.
[47] Qingyu Meng, et al. Preliminary experiences with the uintah framework on Intel Xeon Phi and stampede, 2013, XSEDE.
[48] Laxmikant V. Kalé, et al. ACR: Automatic checkpoint/restart for soft and hard error protection, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[49] Laxmikant V. Kalé, et al. Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[50] Alexander Aiken, et al. Structure Slicing: Extending Logical Regions with Fields, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[51] Franck Cappello, et al. Addressing failures in exascale computing, 2014, Int. J. High Perform. Comput. Appl.
[52] Lukasz Wesolowski, et al. Adaptive techniques for clustered N-body cosmological simulations, 2014, arXiv:1409.1929.
[53] L. Kalé, et al. Charm++ & MPI: Combining the Best of Both Worlds, 2014.
[54] Laxmikant V. Kalé, et al. Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[55] Daniel Sunderland, et al. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns, 2014, J. Parallel Distributed Comput.
[56] Laxmikant V. Kalé, et al. PICS: a performance-analysis-based introspective control system to steer parallel applications, 2014, ROSS@ICS.
[57] Anthony Skjellum, et al. Design and Evaluation of FA-MPI, a Transactional Resilience Scheme for Non-blocking MPI, 2014, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[58] Alexander Aiken, et al. Realm: An event-based low-level runtime for distributed memory architectures, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[59] Scott Klasky, et al. Exploring Automatic, Online Failure Recovery for Scientific Applications at Extreme Scales, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[60] Abhishek Gupta, et al. Parallel Programming with Migratable Objects: Charm++ in Practice, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[61] Bernd Hamann, et al. Dissecting On-Node Memory Access Performance: A Semantic Approach, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[62] Michael Bauer. Legion: Programming Distributed Heterogeneous Architectures with Logical Regions, 2014.
[63] Laxmikant V. Kalé, et al. Scalable replay with partial-order dependencies for message-logging fault tolerance, 2014, 2014 IEEE International Conference on Cluster Computing (CLUSTER).
[64] John Shalf, et al. Abstract Machine Models and Proxy Architectures for Exascale Computing, 2014, 2014 Hardware-Software Co-Design for High Performance Computing.
[65] Richard D. Hornung, et al. The RAJA Portability Layer: Overview and Status, 2014.
[66] Bernd Hamann, et al. Combing the Communication Hairball: Visualizing Parallel Execution Traces using Logical Time, 2014, IEEE Transactions on Visualization and Computer Graphics.
[67] Michael A. Heroux, et al. Toward Local Failure Local Recovery Resilience Model using MPI-ULFM, 2014, EuroMPI/ASIA.
[68] Martin Berzins, et al. A Scalable Algorithm for Radiative Heat Transfer Using Reverse Monte Carlo Ray Tracing, 2015, ISC.
[69] Martin Schulz, et al. A Flexible Data Model to Support Multi-domain Performance Analysis, 2015.
[70] Paul Lin, et al. CFD for Next Generation Hardware: Experiences with Proxy Applications, 2015.
[71] Bernd Hamann, et al. Recovering logical structure from Charm++ event traces, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[72] Alexander Aiken, et al. Regent: a high-productivity programming language for HPC with logical regions, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[73] Bronis R. de Supinski, et al. The Spack package manager: bringing order to HPC software chaos, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[74] Charles R. Ferenbaugh, et al. PENNANT: an unstructured mesh mini-app for advanced architecture research, 2015, Concurr. Comput. Pract. Exp.
[75] Laxmikant V. Kalé, et al. A Fault-Tolerance Protocol for Parallel Applications with Communication Imbalance, 2015, 2015 27th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD).
[76] Laxmikant V. Kalé, et al. Using Migratable Objects to Enhance Fault Tolerance Schemes in Supercomputers, 2015, IEEE Transactions on Parallel and Distributed Systems.