Hiding message delivery latency using Direct-to-Cache-Transfer techniques in message passing environments
[1] Nikitas J. Dimopoulos,et al. Comparing Direct-to-Cache Transfer Policies to TCP/IP and M-VIA During Receive Operations in MPI Environments , 2007, ISPA.
[2] S. Hioki. Construction of Staples in Lattice Gauge Theory on a Parallel Computer , 1996, Parallel Comput..
[3] Liviu Iftode,et al. TCP Servers: Offloading TCP Processing in Internet Servers. Design, Implementation, and Performance , 2002 .
[4] Cezary Dubnicki,et al. VMMC-2: Efficient Support for Reliable, Connection-Oriented Communication , 1997 .
[5] José González,et al. Owner Prediction for Accelerating Cache-to-Cache Transfer Misses in a cc-NUMA Architecture , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[6] Ian Foster,et al. Parallel Spectral Transform Shallow Water Model: a runtime-tunable parallel benchmark code , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[7] Nikitas J. Dimopoulos,et al. Hiding message delivery and reducing memory access latency by providing direct-to-cache transfer during receive operations in a message passing environment , 2006, Medea.
[8] David E. Culler,et al. High-performance local area communication with fast sockets , 1997 .
[9] Nikitas J. Dimopoulos,et al. Efficient Communication Using Message Prediction for Cluster Multiprocessors , 2000, CANPC.
[10] Ram Huggahalli,et al. Direct cache access for high bandwidth network I/O , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[11] Nikitas J. Dimopoulos,et al. Lazy direct-to-cache transfer during receive operations in a message passing environment , 2006, CF '06.
[12] Thorsten von Eicken,et al. Memory management for user-level network interfaces , 1998, IEEE Micro.
[13] Greg J. Regnier,et al. The Virtual Interface Architecture , 2002, IEEE Micro.
[14] David A. Patterson,et al. Latency lags bandwidth , 2004, CACM.
[15] Charles L. Seitz,et al. Myrinet: A Gigabit-per-Second Local Area Network , 1995, IEEE Micro.
[16] David A. Patterson,et al. Latency Lags Bandwidth , 2005, ICCD.
[17] Dhabaleswar K. Panda,et al. MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems , 2001, IEEE Trans. Parallel Distributed Syst..
[18] Nikitas J. Dimopoulos,et al. Architectural extensions to support efficient communication using message prediction , 2002, Proceedings 16th Annual International Symposium on High Performance Computing Systems and Applications.
[19] Hsiao-Keng Jerry Chu,et al. Zero-Copy TCP in Solaris , 1996, USENIX Annual Technical Conference.
[20] Todd M. Austin,et al. SimpleScalar: An Infrastructure for Computer System Modeling , 2002, Computer.
[21] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[22] David J. Lilja,et al. Characterization of Communication Patterns in Message-Passing Parallel Scientific Application Programs , 1998, CANPC.
[23] Ronald G. Dreslinski,et al. Performance analysis of system overheads in TCP/IP workloads , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).