Fast synchronization on shared-memory multiprocessors: An architectural approach
[1] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[2] Ralph Grishman,et al. The NYU ultracomputer—designing a MIMD, shared-memory parallel machine , 2018, ISCA '98.
[3] V. Clower. Latency , 1979 .
[4] Dimitrios S. Nikolopoulos,et al. The Architectural and Operating System Implications on the Performance of Synchronization on ccNUMA Multiprocessors , 2001, International Journal of Parallel Programming.
[5] Keshav Pingali,et al. I-structures: data structures for parallel computing , 1986, Graph Reduction.
[6] Gerry Kane,et al. MIPS RISC Architecture , 1987 .
[7] Mary K. Vernon,et al. Efficient synchronization primitives for large-scale cache-coherent multiprocessors , 1989, ASPLOS III.
[8] Larry Rudolph,et al. Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors , 1983, TOPL.
[9] Cathy May,et al. The PowerPC Architecture: A Specification for a New Family of RISC Processors , 1994 .
[10] Evangelos P. Markatos,et al. The effects of multiprogramming on barrier synchronization , 1991, Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing.
[11] A. Gottlieb,et al. Debunking then Duplicating Ultracomputer Performance Claims by Debugging the Combining Switches , 2004 .
[12] Michael L. Scott,et al. Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.
[13] Seth Copen Goldstein,et al. Active messages: a mechanism for integrating communication and computation , 1998, ISCA '98.
[14] Steven L. Scott,et al. Synchronization and communication in the T3E multiprocessor , 1996, ASPLOS VII.
[15] Burton J. Smith,et al. The Horizon supercomputing system: architecture and software , 1988, Proceedings. SUPERCOMPUTING '88.
[16] Maged M. Michael,et al. Coherence controller architectures for SMP-based CC-NUMA multiprocessors , 1997, ISCA '97.
[17] D. Burger,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[18] David E. Culler,et al. Monsoon: an explicit token-store architecture , 1998, ISCA '98.
[19] Fabrizio Petrini,et al. Scalable collective communication on the ASCI Q machine , 2003, 11th Symposium on High Performance Interconnects, 2003. Proceedings.
[21] Rajiv Gupta,et al. A scalable implementation of barrier synchronization using an adaptive combining tree , 1990, International Journal of Parallel Programming.
[22] James H. Anderson,et al. Shared-memory Mutual Exclusion: Major Research Trends Since 1986 , 2003 .
[23] Michael L. Scott,et al. Fast, contention-free combining tree barriers for shared-memory multiprocessors , 1994, International Journal of Parallel Programming.
[24] John L. Hennessy,et al. Latency, Occupancy, and Bandwidth in DSM Multiprocessors: A Performance Evaluation , 2003, IEEE Trans. Computers.
[25] Mary K. Vernon,et al. Efficient synchronization primitives for large-scale cache-coherent multiprocessors , 1989, ASPLOS 1989.
[26] Dhabaleswar K. Panda,et al. MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems , 2001, IEEE Trans. Parallel Distributed Syst..
[27] Shisheng Shang,et al. Distributed Hardwired Barrier Synchronization for Scalable Multiprocessor Clusters , 1995, IEEE Trans. Parallel Distributed Syst..
[28] Thomas E. Anderson,et al. The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors , 1990, IEEE Trans. Parallel Distributed Syst..
[29] Nian-Feng Tzeng,et al. Distributing Hot-Spot Addressing in Large-Scale Multiprocessors , 1987, IEEE Transactions on Computers.
[30] Maged M. Michael,et al. Coherence controller architectures for SMP-based CC-NUMA multiprocessors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[31] Allan Porterfield,et al. The Tera computer system , 1990, ICS '90.
[32] Dhabaleswar K. Panda,et al. Efficient barrier using remote memory operations on VIA-based clusters , 2002, Proceedings. IEEE International Conference on Cluster Computing.
[33] James R. Goodman,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, International Symposium on Computer Architecture.