Using destination-set prediction to improve the latency/bandwidth tradeoff in shared-memory multiprocessors
[1] Min Xu,et al. Evaluating Non-deterministic Multi-threaded Commercial Workloads , 2001 .
[2] D.A. Wood,et al. Reactive NUMA: A Design For Unifying S-COMA And CC-NUMA , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[3] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[4] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[5] Fredrik Dahlgren. Boosting the performance of hybrid snooping cache protocols , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[6] Laxmi N. Bhuyan,et al. Design of an Adaptive Cache Coherence Protocol for Large Scale Multiprocessors , 1992, IEEE Trans. Parallel Distributed Syst..
[7] Jim Nilsson,et al. Improving performance of load-store sequences for transaction processing workloads on multiprocessors , 1999, Proceedings of the 1999 International Conference on Parallel Processing.
[8] Anna R. Karlin,et al. Competitive snoopy caching , 1986, 27th Annual Symposium on Foundations of Computer Science (sfcs 1986).
[9] David J. Lilja,et al. The Potential of Compile-Time Analysis to Adapt the Cache Coherence Enforcement Strategy to the Data Sharing Characteristics , 1995, IEEE Trans. Parallel Distributed Syst..
[10] Anna R. Karlin,et al. Two adaptive hybrid cache coherency protocols , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[11] José González,et al. Owner Prediction for Accelerating Cache-to-Cache Transfer Misses in a cc-NUMA Architecture , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[12] Luiz André Barroso,et al. Memory system characterization of commercial workloads , 1998 .
[13] Milo M. K. Martin,et al. Token Coherence: decoupling performance and correctness , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[14] Sarita V. Adve,et al. Performance of database workloads on shared-memory systems with out-of-order processors , 1998, ASPLOS VIII.
[15] Josep Torrellas,et al. Distance-adaptive update protocols for scalable shared-memory multiprocessors , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[16] Milo M. K. Martin,et al. Bandwidth adaptive snooping , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[17] Erik Hagersten,et al. WildFire: a scalable path for SMPs , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[18] Milo M. K. Martin,et al. Simulating a $2M Commercial Server on a $2K PC , 2001 .
[19] Stefanos Kaxiras,et al. Improving CC-NUMA performance using Instruction-based Prediction , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[20] Fredrik Larsson,et al. Simics: A Full System Simulation Platform , 2002, Computer.
[21] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[22] David A. Wood,et al. Full-system timing-first simulation , 2002, SIGMETRICS '02.
[23] Anoop Gupta,et al. Cache Invalidation Patterns in Shared-Memory Multiprocessors , 1992, IEEE Trans. Computers.
[24] K. Gharachorloo,et al. Architecture and design of AlphaServer GS320 , 2000, ASPLOS IX.
[25] Mark D. Hill,et al. Using prediction to accelerate coherence protocols , 1998, ISCA.
[26] Steven R. Kunkel,et al. System optimization for OLTP workloads , 1999, IEEE Micro.
[27] David A. Wood,et al. Multicast snooping: a new coherence method using a multicast address network , 1999, ISCA.
[28] Luiz André Barroso,et al. Memory system characterization of commercial workloads , 1998, ISCA.
[29] Milo M. K. Martin,et al. Specifying and Verifying a Broadcast and a Multicast Snooping Cache Coherence Protocol , 2002, IEEE Trans. Parallel Distributed Syst..
[30] Jim Nilsson,et al. Reducing ownership overhead for load-store sequences in cache-coherent multiprocessors , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[31] Babak Falsafi,et al. Selective, accurate, and timely self-invalidation using last-touch prediction , 2000, ISCA '00.
[32] Sigarch. Proceedings 30th Annual International Symposium on Computer Architecture , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[33] Willy Zwaenepoel,et al. Adaptive software cache management for distributed shared memory architectures , 1990, ISCA '90.
[34] Mats Brorsson,et al. An adaptive cache coherence protocol optimized for migratory sharing , 1993, ISCA '93.
[35] Josep Torrellas,et al. The memory performance of DSS commercial workloads in shared-memory multiprocessors , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.
[36] Stefanos Kaxiras,et al. Coherence communication prediction in shared-memory multiprocessors , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[37] José González,et al. The use of prediction for accelerating upgrade misses in cc-NUMA multiprocessors , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[38] Babak Falsafi,et al. Memory sharing predictor: the key to a speculative coherent DSM , 1999, ISCA.
[39] Robert J. Fowler,et al. Adaptive cache coherency for detecting migratory shared data , 1993, ISCA '93.