Towards general and exact distributed invalidation
[1] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[2] Kam-Fai Wong,et al. EDS: A Parallel Computer System for Advanced Information Processing , 1992, PARLE.
[3] Pen-Chung Yew,et al. A compiler-directed cache coherence scheme with improved intertask locality , 1994, Proceedings of Supercomputing '94.
[4] Hoichi Cheong,et al. Life span strategy—a compiler-based approach to cache coherence , 1992, ICS '92.
[5] James R. Larus,et al. Cooperative Shared Memory: Software and Hardware Support for Scalable Multiprocessors , 1992, International Conference on Architectural Support for Programming Languages and Operating Systems.
[6] Anant Agarwal,et al. LimitLESS directories: A scalable cache coherence scheme , 1991, ASPLOS IV.
[7] Alexander V. Veidenbaum,et al. Compiler-directed cache management in multiprocessors , 1990, Computer.
[8] Willy Zwaenepoel,et al. Munin: distributed shared memory based on type-specific memory coherence , 1990, PPOPP '90.
[9] Pen-Chung Yew,et al. Compiler Analysis for Cache Coherence: Interprocedural Array Data-Flow Analysis and Its Impact on Cache Performance , 2000, IEEE Trans. Parallel Distributed Syst.
[10] Michael F. P. O'Boyle,et al. Compiler Reduction of Invalidation Traffic in Virtual Shared Memory Systems , 1996, Euro-Par, Vol. I.
[11] Ken Kennedy,et al. Automatic software cache coherence through vectorization , 1992, ICS '92.
[12] James R. Larus,et al. Cachier: A Tool for Automatically Inserting CICO Annotations , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[13] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[14] Per Stenström,et al. Evaluation of Compiler-Controlled Updating to Reduce Coherence-Miss Penalties in Shared-Memory Multiprocessors , 1999, J. Parallel Distributed Comput.
[15] Stefanos Kaxiras,et al. Identification and optimization of sharing patterns for scalable shared-memory multiprocessors , 1998.
[16] Ian Watson,et al. An evaluation of DELTA, a decoupled pre-fetching virtual shared memory system , 1995, Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing.
[17] Babak Falsafi,et al. Memory sharing predictor: the key to a speculative coherent DSM , 1999, ISCA.
[18] Per Stenström,et al. Simple compiler algorithms to reduce ownership overhead in cache coherence protocols , 1994, ASPLOS VI.
[19] Vivek Sarkar,et al. Array SSA form and its use in parallelization , 1998, POPL '98.
[20] William Pugh,et al. The Omega Library interface guide , 1995.
[21] Mark D. Hill,et al. Using prediction to accelerate coherence protocols , 1998, ISCA.
[22] Mats Brorsson,et al. An adaptive cache coherence protocol optimized for migratory sharing , 1993, ISCA '93.
[23] Ralph Grishman,et al. The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer , 1983, IEEE Transactions on Computers.
[24] Hermann Hellwagner,et al. SCI: Scalable Coherent Interface: Architecture and Software for High-Performance Compute Clusters , 1999.
[25] Michael F. P. O'Boyle,et al. A graph based approach to barrier synchronisation minimisation , 1997, ICS '97.
[26] K. Kennedy,et al. Cache coherence using local knowledge , 1993, Supercomputing '93.
[27] James R. Larus,et al. Cooperative shared memory: software and hardware for scalable multiprocessors , 1992, ASPLOS V.
[28] Babak Falsafi,et al. Selective, accurate, and timely self-invalidation using last-touch prediction , 2000, ISCA '00.
[29] Michael F. P. O'Boyle,et al. A compiler algorithm to reduce invalidation latency in virtual shared memory systems , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques.
[30] Michael F. P. O'Boyle,et al. Exact Distributed Invalidation , 2000, Euro-Par.