TiNy threads on BlueGene/P: Exploring many-core parallelisms beyond the traditional OS
Guang R. Gao | Aaron Landwehr | Handong Ye | Robert S. Pavel
[1] Katherine Yelick,et al. Introduction to UPC and Language Specification , 2000 .
[2] Bradford L. Chamberlain,et al. Parallel Programmability and the Chapel Language , 2007, Int. J. High Perform. Comput. Appl..
[3] Seth Copen Goldstein,et al. Active messages: a mechanism for integrating communication and computation , 1998, ISCA '98.
[4] Marcos K. Aguilera,et al. Sinfonia: a new paradigm for building scalable distributed systems , 2007, SOSP.
[5] Jarek Nieplocha,et al. Advances, Applications and Performance of the Global Arrays Shared Memory Programming Toolkit , 2006, Int. J. High Perform. Comput. Appl..
[6] Katherine Yelick,et al. Titanium: a high-performance Java dialect , 1998 .
[7] Philip Heidelberger,et al. The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer , 2008, ICS '08.
[8] Seth Copen Goldstein,et al. Active messages: a mechanism for integrating communication and computation , 1998, ISCA '98.
[9] Robert W. Numrich,et al. Co-array Fortran for parallel programming , 1998, FORF.
[10] Guang R. Gao,et al. TiNy threads: a thread virtual machine for the Cyclops64 cellular architecture , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[11] Kamil Iskra,et al. Characterizing the Performance of “Big Memory” on Blue Gene Linux , 2009, 2009 International Conference on Parallel Processing Workshops.
[12] Guillaume Mercier,et al. Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem , 2007, Parallel Comput..
[13] Cezary Dubnicki,et al. VMMC-2 : Efficient Support for Reliable, Connection-Oriented Communication , 1997 .
[14] Laxmikant V. Kalé,et al. Performance evaluation of adaptive MPI , 2006, PPoPP '06.
[15] Bradley C. Kuszmaul,et al. Cilk: an efficient multithreaded runtime system , 1995, PPoPP '95.
[16] Dhabaleswar K. Panda,et al. High Performance Remote Memory Access Communication: The ARMCI Approach , 2006, Int. J. High Perform. Comput. Appl..
[17] Robert J. Harrison,et al. Global Arrays: a portable "shared-memory" programming model for distributed memory computers , 1994, Proceedings of Supercomputing '94.
[18] Anthony Skjellum,et al. Using MPI - portable parallel programming with the message-passing interface , 1994 .
[19] Vivek Sarkar,et al. X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.
[20] Juan del Cuvillo,et al. Breaking away from the OS shadow: A program execution model aware thread virtual machine for multicore architectures , 2008 .
[21] Wilson C. Hsieh,et al. Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.
[22] Robert D. Blumofe,et al. Adaptive and Reliable Parallel Computing on Networks of Workstations , 1997 .
[23] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[24] Katherine A. Yelick,et al. Scaling communication-intensive applications on BlueGene/P using one-sided communication and overlap , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.