NetDIMM: Low-Latency Near-Memory Network Interface Architecture
[1] Patrick J. Meaney, et al. The IBM z13 memory subsystem for big data, 2015, IBM J. Res. Dev.
[2] Van Jacobson, et al. Congestion avoidance and control, 1988, SIGCOMM '88.
[3] Scott Rixner, et al. Increasing web server throughput with network interface data caching, 2002, ASPLOS X.
[4] Laxmi N. Bhuyan, et al. A new server I/O architecture for high speed networks, 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[5] Mendel Rosenblum, et al. Network Interface Design for Low Latency Request-Response Protocols, 2013, USENIX ATC.
[6] Eunyoung Jeong, et al. mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems, 2014, NSDI.
[7] Noboru Tanabe, et al. MEMOnet: network interface plugged into a memory slot, 2000, Proceedings IEEE International Conference on Cluster Computing. CLUSTER 2000.
[8] Sameer Seth, et al. TCP/IP architecture, design, and implementation in Linux, 2008.
[9] Ronald Minnich, et al. The memory-integrated network interface, 1995, IEEE Micro.
[10] Phillipp Bergmann, et al. PCI Express System Architecture, 2016.
[11] Ram Huggahalli, et al. Architectural Breakdown of End-to-End Latency in a TCP/IP Network, 2007, 19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07).
[12] Albert G. Greenberg, et al. Data center TCP (DCTCP), 2010, SIGCOMM '10.
[13] Alex C. Snoeren, et al. Inside the Social Network's (Datacenter) Network, 2015, Comput. Commun. Rev.
[14] Michael Kagan, et al. Performance evaluation of the RDMA over Ethernet (RoCE) standard in enterprise data centers infrastructure, 2011.
[15] Ram Huggahalli, et al. Direct cache access for high bandwidth network I/O, 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[16] Thomas F. Wenisch, et al. Thin servers with smart pipes: designing SoC accelerators for memcached, 2013, ISCA.
[17] David G. Andersen, et al. Design Guidelines for High Performance RDMA Systems, 2016, USENIX ATC.
[18] George Varghese, et al. Every microsecond counts: tracking fine-grain latencies with a lossy difference aggregator, 2009, SIGCOMM '09.
[19] Mingyu Chen, et al. DMA cache: Using on-chip storage to architecturally separate I/O data from CPU data for improving I/O performance, 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[20] Ali G. Saidi, et al. Integrated network interfaces for high-bandwidth TCP/IP, 2006, ASPLOS XII.
[21] Andrew W. Moore, et al. Understanding PCIe performance for end host networking, 2018, SIGCOMM.
[22] Mohammad Alian, et al. Simulating PCI-Express Interconnect for Future System Exploration, 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC).
[23] Mohammad Alian, et al. dist-gem5: Distributed simulation of computer clusters, 2017, 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[24] Wen-Fong Wang, et al. Study on enhanced strategies for TCP/IP offload engines, 2005, 11th International Conference on Parallel and Distributed Systems (ICPADS'05).
[25] David A. Maltz, et al. Network traffic characteristics of data centers in the wild, 2010, IMC '10.
[26] Jung Ho Ahn, et al. NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules, 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[27] Mark Handley, et al. Re-architecting datacenter networks and stacks for low latency and high performance, 2017, SIGCOMM.
[28] Akira Kitamura, et al. DIMMnet-2: A Reconfigurable Board Connected Into a Memory Slot, 2006, 2006 International Conference on Field Programmable Logic and Applications.
[29] Dhabaleswar K. Panda, et al. Performance characterization of a 10-Gigabit Ethernet TOE, 2005, 13th Symposium on High Performance Interconnects (HOTI'05).
[30] Amin Vahdat, et al. Less Is More: Trading a Little Bandwidth for Ultra-Low Latency in the Data Center, 2012, NSDI.
[31] Daniel Firestone, et al. VFP: A Virtual Switch Platform for Host SDN in the Public Cloud, 2017, NSDI.
[32] Rachata Ausavarungnirun, et al. RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization, 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[33] Tetsuya Asai, et al. Caching memcached at reconfigurable network interface, 2014, 2014 24th International Conference on Field Programmable Logic and Applications (FPL).
[34] Thomas E. Anderson, et al. Ingress Pipeline Queues Packet Buffer DMA PipelineDMA Egress Pipeline, 2015.
[35] Ben Lee, et al. Platform IO DMA Transaction Acceleration, 2012.
[36] Jinjun Xiong, et al. Application-Transparent Near-Memory Processing Architecture with Memory Channel Network, 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[37] Somayeh Sardashti, et al. The gem5 simulator, 2011, CARN.
[38] Katerina J. Argyraki, et al. ResQ: Enabling SLOs in Network Function Virtualization, 2018, NSDI.
[39] Scott Rixner, et al. Network interface data caching, 2005, IEEE Transactions on Computers.
[40] Dong Kyue Kim, et al. An Efficient Architecture for a TCP Offload Engine Based on Hardware/Software Co-design, 2011, J. Inf. Sci. Eng.
[41] Kushagra Vaid, et al. Azure Accelerated Networking: SmartNICs in the Public Cloud, 2018, NSDI.
[42] Thomas F. Wenisch, et al. Simulating DRAM controllers for future system architecture exploration, 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[43] Bharat Sukhwani, et al. ConTutto: A Novel FPGA-based Prototyping Platform Enabling Innovation in the Memory Subsystem of a Server Class Processor, 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[44] Jia Song, et al. Performance Review of Zero Copy Techniques, 2012.