Latency, Occupancy, and Bandwidth in DSM Multiprocessors: A Performance Evaluation
John L. Hennessy | Jaswinder Pal Singh | Mainak Chaudhuri | Mark Heinrich | Edward Rothberg | Chris Holt
[1] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[2] Anoop Gupta,et al. Working sets, cache sizes, and node granularity issues for large-scale multiprocessors , 1993, ISCA '93.
[3] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[4] R. Karp,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[5] John L. Hennessy,et al. The performance advantages of integrating block data transfer in cache-coherent multiprocessors , 1994, ASPLOS VI.
[6] Anoop Gupta,et al. The Stanford FLASH multiprocessor , 1994, ISCA '94.
[7] Kai Li,et al. Retrospective: virtual memory mapped network interface for the SHRIMP multicomputer , 1994, ISCA '98.
[8] James R. Larus,et al. Fine-grain access control for distributed shared memory , 1994, ASPLOS VI.
[9] Michael C. Browne,et al. The S3.mp scalable shared memory multiprocessor , 1994, 1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences.
[10] Anoop Gupta,et al. The performance impact of flexibility in the Stanford FLASH multiprocessor , 1994, ASPLOS VI.
[11] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[12] John L. Hennessy,et al. The Effects of Latency, Occupancy, and Bandwidth in Distributed Shared Memory Multiprocessors , 1995 .
[13] Dennis G. Shea,et al. The SP2 High-Performance Switch , 1995, IBM Syst. J..
[14] David A. Wood,et al. Cost-Effective Parallel Computing , 1995, Computer.
[15] S.K. Reinhardt,et al. Decoupled Hardware Support for Distributed Shared Memory , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[16] D.E. Culler,et al. Effects of communication latency, overhead, and bandwidth in a cluster architecture , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[17] Liviu Iftode,et al. Relaxed consistency and coherence granularity in DSM systems: a performance evaluation , 1997, PPOPP '97.
[18] Jaswinder Pal Singh,et al. The effects of communication parameters on end performance of shared virtual memory clusters , 1997, SC '97.
[19] Mike Galles. Spider: a high-speed network interconnect , 1997, IEEE Micro.
[20] Maged M. Michael,et al. Coherence controller architectures for SMP-based CC-NUMA multiprocessors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[21] D. Lenoski,et al. The SGI Origin: a ccNUMA highly scalable server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[22] Mary K. Vernon,et al. Analytic evaluation of shared-memory systems with ILP processors , 1998, ISCA.
[23] Donald Yeung,et al. The MIT Alewife machine: architecture and performance , 1995, ISCA '98.
[24] Rajeev Barua,et al. The sensitivity of communication mechanisms to bandwidth and latency , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.
[25] Mark Heinrich,et al. FLASH vs. (simulated) FLASH: closing the simulation loop , 2000, SIGP.
[26] Evan Speight. Providing hardware DSM performance at software DSM cost , 2000 .
[27] Weisong Shi,et al. A novel multicast scheme to reduce cache invalidation overheads in DSM systems , 2000, Conference Proceedings of the 2000 IEEE International Performance, Computing, and Communications Conference (Cat. No.00CH37086).
[28] Maged M. Michael,et al. High-throughput coherence controllers , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[29] K. Gharachorloo,et al. Architecture and design of AlphaServer GS320 , 2000, ASPLOS IX.