Evaluation of PGAS Communication Paradigms with Geometric Multigrid
[1] Michael Garland, et al. Designing a unified programming model for heterogeneous machines, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[2] Katherine A. Yelick, et al. Titanium: A High-performance Java Dialect, 1998, Concurr. Pract. Exp.
[3] Daniel Grünewald. BQCD with GPI: A case study, 2012, 2012 International Conference on High Performance Computing & Simulation (HPCS).
[4] Dan Bonachea. GASNet Specification, v1.1, 2002.
[5] Samuel Williams, et al. Optimization of geometric multigrid for emerging multi- and manycore processors, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[6] Rui Machado, et al. Unbalanced tree search on a manycore system using the GPI programming model, 2011, Computer Science - Research and Development.
[7] Marc Snir, et al. Optimizing the Barnes-Hut algorithm in UPC, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[8] Katherine A. Yelick, et al. A Local-View Array Library for Partitioned Global Address Space C++ Programs, 2014, ARRAY@PLDI.
[9] Torsten Hoefler, et al. Enabling highly-scalable remote memory access programming with MPI-3 one sided, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[10] Bradford L. Chamberlain, et al. Parallel Programmability and the Chapel Language, 2007, Int. J. High Perform. Comput. Appl.
[11] William N. Scherer, et al. A new vision for coarray Fortran, 2009, PGAS '09.
[12] Leonid Oliker, et al. Implementation and Optimization of miniGMG - a Compact Geometric Multigrid Benchmark, 2012.
[13] Tarek A. El-Ghazawi, et al. UPC Performance and Potential: A NPB Experimental Study, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[14] Phillip Colella, et al. Adaptive mesh refinement in Titanium, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[15] Vivek Sarkar, et al. X10: an object-oriented approach to non-uniform cluster computing, 2005, OOPSLA '05.
[16] Barbara M. Chapman, et al. Performance Analysis of the NWChem TCE for Different Communication Patterns, 2013, PMBS@SC.
[17] Dan Bonachea. Proposal for extending the UPC memory copy library functions and supporting extensions to GASNet, 2004.
[18] Nicholas J. Wright, et al. Accelerating Applications at Scale Using One-Sided Communication, 2012.
[19] Daniel Etiemble, et al. Automatic Task-Based Code Generation for High Performance Domain Specific Embedded Language, 2014, International Journal of Parallel Programming.
[20] Katherine A. Yelick, et al. UPC++: A PGAS Extension for C++, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[21] Steven J. Deitz, et al. The High-Level Parallel Language ZPL Improves Productivity and Performance, 2004.
[22] Katherine A. Yelick, et al. Titanium Performance and Potential: An NPB Experimental Study, 2005, LCPC.
[23] Jens Jägersküpper, et al. A PGAS-based Implementation for the Unstructured CFD Solver TAU, 2011.