Optimizing communication in HPF programs on fine-grain distributed shared memory
[1] Kourosh Gharachorloo, et al. Shasta: a low overhead, software-only approach for supporting fine-grain shared memory, 1996, ASPLOS VII.
[2] Evgenia Smirni, et al. The KSR1: experimentation and modeling of poststore, 1993, SIGMETRICS '93.
[3] Ken Kennedy, et al. An Implementation of Interprocedural Bounded Regular Section Analysis, 1991, IEEE Trans. Parallel Distributed Syst.
[4] S.K. Reinhardt, et al. Decoupled Hardware Support for Distributed Shared Memory, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[5] Anoop Gupta, et al. Comparative evaluation of latency reducing and tolerating techniques, 1991, ISCA '91.
[6] Kevin P. McAuliffe, et al. Automatic Management of Programmable Caches, 1988, ICPP.
[7] Kevin P. McAuliffe, et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture, 1985, ICPP.
[8] Anoop Gupta, et al. Design and evaluation of a compiler algorithm for prefetching, 1992, ASPLOS V.
[9] Chau-Wen Tseng. An optimizing Fortran D compiler for MIMD distributed-memory machines, 1993.
[10] Edith Schonberg, et al. A Unified Framework for Optimizing Communication in Data-Parallel Programs, 1996, IEEE Trans. Parallel Distributed Syst.
[11] Anne Rogers. Compiling for locality of reference, 1990.
[12] James R. Larus, et al. Cooperative shared memory: software and hardware for scalable multiprocessors, 1992, ASPLOS V.
[13] Charles Koelbel, et al. Compiling Global Name-Space Parallel Loops for Distributed Execution, 1991, IEEE Trans. Parallel Distributed Syst.
[14] Ruby B. Lee, et al. Tempest: a substrate for portable parallel programs, 1995.
[15] Margaret Martonosi, et al. Evaluating the impact of advanced memory systems on compiler-parallelized codes, 1995, PACT.
[16] James R. Larus, et al. Implementing Fine-grain Distributed Shared Memory on Commodity SMP Workstations, 1996.
[17] T. Lovett, et al. STiNG: A CC-NUMA Computer System for the Commercial Marketplace, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[18] Alan L. Cox, et al. Evaluating the performance of software distributed shared memory as a target for parallelizing compilers, 1997, Proceedings 11th International Parallel Processing Symposium.
[19] Willy Zwaenepoel, et al. Implementation and performance of Munin, 1991, SOSP '91.
[20] James R. Larus, et al. Teapot: language support for writing memory coherence protocols, 1996, PLDI '96.
[21] Monica S. Lam, et al. Data and computation transformations for multiprocessors, 1995, PPOPP '95.
[22] Alexander V. Veidenbaum, et al. Compiler-directed cache management in multiprocessors, 1990, Computer.
[23] Josep Torrellas, et al. Data forwarding in scalable shared-memory multiprocessors, 1995, ICS '95.
[24] Alan L. Cox, et al. An integrated compile-time/run-time software distributed shared memory system, 1996, ASPLOS VII.
[25] James R. Larus, et al. HPF on Fine-Grain Distributed Shared Memory: Early Experience, 1996, LCPC.
[26] James R. Larus, et al. Application-specific protocols for user-level shared memory, 1994, Proceedings of Supercomputing '94.
[27] James R. Larus, et al. Efficient support for irregular applications on distributed-memory machines, 1995, PPOPP '95.
[28] James R. Larus, et al. Tempest: a substrate for portable parallel programs, 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[29] Alan L. Cox, et al. TreadMarks: shared memory computing on networks of workstations, 1996.
[30] Anoop Gupta, et al. The Stanford FLASH Multiprocessor, 1994, ISCA.
[31] William Pugh, et al. Eliminating false data dependences using the Omega test, 1992, PLDI '92.
[32] James R. Larus, et al. LCM: memory system support for parallel language implementation, 1994, ASPLOS VI.
[33] Chau-Wen Tseng, et al. Enhancing software DSM for compiler-parallelized applications, 1997, Proceedings 11th International Parallel Processing Symposium.
[34] James R. Larus, et al. Cooperative shared memory: software and hardware for scalable multiprocessors, 1993, TOCS.
[35] James R. Larus, et al. Tempest and typhoon: user-level shared memory, 1994, ISCA '94.