Using overdecomposition to overlap communication latencies with computation and take advantage of SMT processors
[1] G. C. Fox,et al. Solving Problems on Concurrent Processors , 1988 .
[2] John Markus Bjørndalen,et al. EventSpace - Exposing and Observing Communication Behavior of Parallel Cluster Applications , 2003, Euro-Par.
[3] Dean M. Tullsen,et al. Tuning Compiler Optimizations for Simultaneous Multithreading , 2004, International Journal of Parallel Programming.
[4] Gregory A. Koenig,et al. Using message-driven objects to mask latency in grid computing applications , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[5] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard , 1994 .
[6] John Markus Bjørndalen,et al. Collective Communication Performance Analysis Within the Communication System , 2004, Euro-Par.
[7] D. Marr,et al. Hyper-Threading Technology Architecture and Microarchitecture , 2002 .
[8] A. Snavely,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[9] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard , 1994 .
[10] Brian Vinter,et al. Java PastSet: a structured distributed shared memory system , 2003, IEE Proc. Softw..
[11] Mark A. Johnson,et al. Solving problems on concurrent processors. Vol. 1: General techniques and regular problems , 1988 .
[12] Balaram Sinharoy,et al. IBM Power5 chip: a dual-core multithreaded processor , 2004, IEEE Micro.
[13] Dean M. Tullsen,et al. Supporting fine-grained synchronization on a simultaneous multithreading processor , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[14] Philippe Roussel,et al. The microarchitecture of the Intel Pentium 4 processor on 90nm technology , 2004 .
[15] David A. Koufaty,et al. Hyperthreading Technology in the Netburst Microarchitecture , 2003, IEEE Micro.
[16] Renato J. O. Figueiredo,et al. Impact of heterogeneity on DSM performance , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[17] Erich M. Nahum,et al. Evaluating the impact of simultaneous multithreading on network servers using real hardware , 2005, SIGMETRICS '05.
[18] Susan J. Eggers,et al. An analysis of operating system behavior on a simultaneous multithreaded architecture , 2000, ASPLOS IX.
[19] Jack J. Dongarra,et al. A Scalable Cross-Platform Infrastructure for Application Performance Tuning Using Hardware Counters , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[20] Ulrich Drepper,et al. The Native POSIX Thread Library for Linux , 2002 .
[21] Dean M. Tullsen,et al. Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading , 1997, TOCS.
[22] Dean M. Tullsen,et al. Initial observations of the simultaneous multithreading Pentium 4 processor , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[23] Brian Vinter,et al. Past-Set - A Distributed Structured Shared Memory System , 1999, HPCN Europe.