The trade-off between implicit and explicit data distribution in shared-memory programming paradigms
Eduard Ayguadé | Jesús Labarta | Dimitrios S. Nikolopoulos | Constantine D. Polychronopoulos | Theodore S. Papatheodorou
[1] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[2] Joel H. Saltz,et al. Runtime compilation techniques for data partitioning and communication schedule reuse , 1993, Supercomputing '93. Proceedings.
[3] Michael Frumkin,et al. Implementation of NAS Parallel Benchmarks in High Performance Fortran , 2000 .
[4] Eduard Ayguadé,et al. A case for user-level dynamic page migration , 2000, ICS '00.
[5] Eduard Ayguadé,et al. Is Data Distribution Necessary in OpenMP? , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[6] H. Jin,et al. The OpenMP Implementation of NAS Parallel Benchmarks and Its Performance , 1999 .
[7] Evangelos P. Markatos,et al. Using processor affinity in loop scheduling on shared-memory multiprocessors , 1992, Supercomputing '92.
[8] Hugh Garraway. Parallel Computer Architecture: A Hardware/Software Approach , 1999, IEEE Concurrency.
[9] Nenad Nedeljkovic,et al. Data distribution support on distributed shared memory multiprocessors , 1997, PLDI '97.
[10] Joel H. Saltz,et al. Run-time parallelization and scheduling of loops. Contract report , 1988 .
[11] Using Processor Affinity in Loop Scheduling on Shared Memory Multiprocessors , 1994 .
[12] Joel H. Saltz,et al. Run-Time Parallelization and Scheduling of Loops , 1991, IEEE Trans. Computers.
[13] Eduard Ayguadé,et al. UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors , 2000, LCR.
[14] Leonid Oliker,et al. A Comparison of Three Programming Models for Adaptive Applications on the Origin2000 , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[15] Ricardo Bianchini,et al. Using simple page placement policies to reduce the cost of cache fills in coherent shared-memory systems , 1995, Proceedings of 9th International Parallel Processing Symposium.
[16] John S. Keen,et al. Measuring Memory Hierarchy Performance of Cache-Coherent Multiprocessors Using Micro Benchmarks , 1997, ACM/IEEE SC 1997 Conference (SC'97).
[17] Siegfried Benkner,et al. Exploiting Data Locality on Scalable Shared Memory Machines with Data Parallel Programs , 2000, Euro-Par.
[18] Jonathan Harris,et al. Extending OpenMP For NUMA Machines , 2000, ACM/IEEE SC 2000 Conference (SC'00).