Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors
[1] Calvin K. Tang. Cache system design in the tightly coupled multiprocessor system , 1976, AFIPS '76.
[2] Paul Feautrier,et al. A New Solution to Coherence Problems in Multicache Systems , 1978, IEEE Transactions on Computers.
[3] Leslie Lamport,et al. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 1979, IEEE Transactions on Computers.
[4] Burton J. Smith. Architecture And Applications Of The HEP Multiprocessor Computer System , 1982, Optics & Photonics.
[5] Kevin P. McAuliffe,et al. Automatic Management of Programmable Caches , 1988, ICPP.
[6] Thomas E. Anderson,et al. The Performance Implications of Spin-Waiting Alternatives for Shared-Memory Multiprocessors , 1989, ICPP.
[7] Michel Dubois,et al. Access ordering and coherence in shared memory multiprocessors , 1989 .
[8] Alexander V. Veidenbaum,et al. Compiler-directed cache management in multiprocessors , 1990, Computer.
[9] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, ISCA 1990.
[10] Mark D. Hill,et al. Implementing Sequential Consistency in Cache-Based Systems , 1990, ICPP.
[11] M. Hill,et al. Weak ordering-a new definition , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[12] Anoop Gupta,et al. Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[13] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[14] Anant Agarwal,et al. LimitLESS directories: A scalable cache coherence scheme , 1991, ASPLOS IV.
[15] T. Mowry,et al. Comparative evaluation of latency reducing and tolerating techniques , 1991, [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture.
[16] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..
[17] James R. Larus,et al. Cooperative shared memory: software and hardware for scalable multiprocessors , 1992, ASPLOS V.
[18] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[19] Sang Lyul Min,et al. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps , 1992, IEEE Trans. Parallel Distributed Syst..
[20] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[21] Erik Hagersten,et al. DDM - A Cache-Only Memory Architecture , 1992, Computer.
[22] T. von Eicken,et al. Parallel programming in Split-C , 1993, Supercomputing '93.
[23] Robert J. Fowler,et al. Adaptive cache coherency for detecting migratory shared data , 1993, ISCA '93.
[24] Andrea C. Arpaci-Dusseau,et al. Parallel programming in Split-C , 1993, Supercomputing '93. Proceedings.
[25] James R. Larus,et al. The Wisconsin Wind Tunnel: virtual prototyping of parallel computers , 1993, SIGMETRICS '93.
[26] Mats Brorsson,et al. An adaptive cache coherence protocol optimized for migratory sharing , 1993, ISCA '93.
[27] James R. Larus,et al. Cooperative shared memory: software and hardware for scalable multiprocessors , 1993, TOCS.
[28] K. Kennedy,et al. Cache coherence using local knowledge , 1993, Supercomputing '93.
[29] Anoop Gupta,et al. The Stanford FLASH Multiprocessor , 1994, ISCA.
[30] James R. Larus,et al. Tempest and typhoon: user-level shared memory , 1994, ISCA '94.
[31] Anoop Gupta,et al. The Stanford FLASH multiprocessor , 1994, ISCA '94.
[32] Ken Chan,et al. PA7200: a PA-RISC processor with integrated high performance MP bus interface , 1994, Proceedings of COMPCON '94.
[33] Pen-Chung Yew,et al. A compiler-directed cache coherence scheme with improved intertask locality , 1994, Proceedings of Supercomputing '94.
[34] James R. Larus,et al. Fine-grain access control for distributed shared memory , 1994, ASPLOS VI.
[35] James R. Larus,et al. Mechanisms for Cooperative Shared Memory , 1994 .
[36] P. Stenström,et al. Combined performance gains of simple cache protocol extensions , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[37] James R. Larus,et al. Cachier: A Tool for Automatically Inserting CICO Annotations , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[38] Michel Dubois,et al. Combined performance gains of simple cache protocol extensions , 1994, ISCA '94.
[39] Alan L. Cox,et al. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems , 1994, USENIX Winter.
[40] Gregory R. Andrews,et al. Distributed filaments: efficient fine-grain parallelism on a cluster of workstations , 1994, OSDI '94.
[41] David E. Culler,et al. A case for NOW (networks of workstations) , 1995, PODC '95.
[42] Alvin R. Lebeck,et al. Tools and techniques for memory system design and analysis , 1996 .