HyperPlane: A Scalable Low-Latency Notification Accelerator for Software Data Planes
Thomas F. Wenisch | Amirhossein Mirhosseini | Hossein Golestani
[1] Michael Ferdman,et al. Taming the Killer Microsecond , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[2] Ren Wang,et al. HALO: Accelerating Flow Classification for Scalable Packet Processing in NFV , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[3] William J. Dally,et al. Principles and Practices of Interconnection Networks , 2004 .
[4] Gang Cao,et al. SPDK Vhost-NVMe: Accelerating I/Os in Virtual Machines on NVMe SSDs via User Space Vhost Target , 2018, 2018 IEEE 8th International Symposium on Cloud and Service Computing (SC2).
[5] Yuan He,et al. An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems , 2019, ASPLOS.
[6] Michael L. Scott,et al. Hodor: Intra-Process Isolation for High-Throughput Data Plane Libraries , 2019, USENIX Annual Technical Conference.
[7] Gerald Q. Maguire,et al. RSS++: load and state-aware receive side scaling , 2019, CoNEXT.
[8] H. T. Kung,et al. A Regular Layout for Parallel Adders , 1982, IEEE Transactions on Computers.
[9] Nan Hua,et al. Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization , 2018, NSDI.
[10] Timothy Roscoe,et al. Arrakis , 2014, OSDI.
[11] Christoforos E. Kozyrakis,et al. ReFlex: Remote Flash ≈ Local Flash , 2017, ASPLOS.
[12] Cheng-Chew Lim,et al. Parallel prefix adder design , 2001, Proceedings 15th IEEE Symposium on Computer Arithmetic. ARITH-15 2001.
[13] Nick McKeown,et al. Designing and implementing a fast crossbar scheduler , 1999, IEEE Micro.
[14] Shubhendu S. Mukherjee,et al. Coherent Network Interfaces for Fine-Grain Communication , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[15] Gerald Q. Maguire,et al. Make the Most out of Last Level Cache in Intel Processors , 2019, EuroSys.
[16] Kris Gaj,et al. A novel modular adder for one thousand bits and more using fast carry chains of modern FPGAs , 2014, 2014 24th International Conference on Field Programmable Logic and Applications (FPL).
[17] David Zhang,et al. Secure program execution via dynamic information flow tracking , 2004, ASPLOS XI.
[18] Jing Liu,et al. I'm Not Dead Yet!: The Role of the Operating System in a Kernel-Bypass Era , 2019, HotOS.
[19] Baochun Li,et al. Erasure coding for cloud storage systems: A survey , 2013 .
[20] Karan Gupta,et al. Offloading distributed applications onto smartNICs using iPipe , 2019, SIGCOMM.
[21] Christoforos E. Kozyrakis,et al. Raksha: a flexible information flow architecture for software security , 2007, ISCA '07.
[22] Hari Balakrishnan,et al. Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads , 2019, NSDI.
[23] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[24] Yan Solihin,et al. Architecture Support for Improving Bulk Memory Copying and Initialization Performance , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[25] Raouf Boutaba,et al. Re-Architecting NFV Ecosystem with Microservices: State of the Art and Research Challenges , 2019, IEEE Network.
[26] Kushagra Vaid,et al. Azure Accelerated Networking: SmartNICs in the Public Cloud , 2018, NSDI.
[27] David A. Patterson,et al. Attack of the killer microseconds , 2017, Commun. ACM.
[28] Thomas F. Wenisch,et al. Disaggregated memory for expansion and sharing in blade servers , 2009, ISCA '09.
[29] Thomas F. Wenisch,et al. µTune: Auto-Tuned Threading for OLDI Microservices , 2018, OSDI.
[30] David A. Wood,et al. A Primer on Memory Consistency and Cache Coherence , 2012, Synthesis Lectures on Computer Architecture.
[31] Thomas F. Wenisch,et al. μSuite: A Benchmark Suite for Microservices , 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC).
[32] James R. Goodman,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, International Symposium on Computer Architecture.
[33] Rajiv Gupta,et al. ECMon: exposing cache events for monitoring , 2009, ISCA '09.
[34] Thomas F. Wenisch,et al. The Queuing-First Approach for Tail Management of Interactive Services , 2019, IEEE Micro.
[35] Wei Liu,et al. iWatcher: efficient architectural support for software debugging , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[36] Christoforos E. Kozyrakis,et al. Shinjuku: Preemptive Scheduling for μsecond-scale Tail Latency , 2019, NSDI.
[37] Christoforos E. Kozyrakis,et al. IX: A Protected Dataplane Operating System for High Throughput and Low Latency , 2014, OSDI.
[38] Eunyoung Jeong,et al. mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems , 2014, NSDI.
[39] Carsten Binnig,et al. The End of Slow Networks: It's Time for a Redesign , 2015, Proc. VLDB Endow..
[40] Donald Yeung,et al. Transparent threads: resource sharing in SMT processors for high single-thread performance , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[41] Akshitha Sriraman,et al. Accelerometer: Understanding Acceleration Opportunities for Data Center Overheads at Hyperscale , 2020, ASPLOS.
[42] Rasmus Pagh,et al. Cuckoo Hashing , 2001, Encyclopedia of Algorithms.
[43] Marcos K. Aguilera,et al. Remote memory in the age of fast networks , 2017, SoCC.
[44] Travis S. Craig,et al. Building FIFO and Priority-Queuing Spin Locks from Atomic Swap , 1993 .
[45] Ricardo Bianchini,et al. LeapIO: Efficient and Portable Virtual NVMe Storage on ARM SoCs , 2020, ASPLOS.
[46] Thomas F. Wenisch,et al. Enhancing Server Efficiency in the Face of Killer Microseconds , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[47] Gadi Taubenfeld. Shared Memory Synchronization , 2008, Bull. EATCS.
[48] Edouard Bugnion,et al. ZygOS: Achieving Low Tail Latency for Microsecond-scale Networked Tasks , 2017, SOSP.
[49] Babak Falsafi,et al. RPCValet: NI-Driven Tail-Aware Balancing of µs-Scale RPCs , 2019, ASPLOS.
[50] Thomas F. Wenisch,et al. Thermostat: Application-transparent Page Management for Two-tiered Main Memory , 2017, ASPLOS.
[51] Guru Venkataramani,et al. MemTracker: Efficient and Programmable Support for Memory Access Monitoring and Debugging , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[52] Michael L. Scott,et al. Scalable reader-writer synchronization for shared-memory multiprocessors , 1991, PPOPP '91.
[53] Mehdi Baradaran Tahoori,et al. ExtraTime: Modeling and analysis of wearout due to transistor aging at microarchitecture-level , 2012, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012).
[54] KyoungSoo Park,et al. APUNet: Revitalizing GPU as Packet Processing Accelerator , 2017, NSDI.
[55] Amin Vahdat,et al. Snap: a microkernel approach to host networking , 2019, SOSP.
[56] Jack L. Lo,et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[57] Hari Angepat,et al. A cloud-scale acceleration architecture , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[58] Michael Werner,et al. Wake-up latencies for processor idle states on current x86 processors , 2014, Computer Science - Research and Development.
[59] Wolfgang Schröder-Preikschat,et al. Sleepy Sloth: Threads as Interrupts as Threads , 2011, 2011 IEEE 32nd Real-Time Systems Symposium.
[60] Sheila Frankel,et al. The AES-CBC Cipher Algorithm and Its Use with IPsec , 2003, RFC.
[61] Norman P. Jouppi,et al. CACTI 6.0: A Tool to Model Large Caches , 2009 .
[62] Yongqiang Xiong,et al. ClickNP: Highly Flexible and High Performance Network Processing with Reconfigurable Hardware , 2016, SIGCOMM.
[63] H. Fatih Ugurdag,et al. Fast parallel prefix logic circuits for n2n round-robin arbitration , 2012, Microelectron. J..
[64] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[65] Srinivasan Seshan,et al. FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds , 2019, NSDI.
[66] Tong Li,et al. Spin detection hardware for improved management of multithreaded systems , 2006, IEEE Transactions on Parallel and Distributed Systems.
[67] Katerina J. Argyraki,et al. ResQ: Enabling SLOs in Network Function Virtualization , 2018, NSDI.
[68] Quinn Jacobson,et al. Disintermediated Active Communication , 2006, IEEE Computer Architecture Letters.
[69] Rachid Guerraoui,et al. Unlocking Energy , 2016, USENIX Annual Technical Conference.
[70] Garth A. Gibson,et al. RAID: high-performance, reliable secondary storage , 1994, CSUR.
[71] Babak Falsafi,et al. Optimus Prime: Accelerating Data Transformation in Servers , 2020, ASPLOS.
[72] Christoforos E. Kozyrakis,et al. The ZCache: Decoupling Ways and Associativity , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[73] Thomas F. Wenisch,et al. Express-Lane Scheduling and Multithreading to Minimize the Tail Latency of Microservices , 2019, 2019 IEEE International Conference on Autonomic Computing (ICAC).
[74] Erik Hagersten,et al. Queue locks on cache coherent multiprocessors , 1994, Proceedings of 8th International Parallel Processing Symposium.
[75] Thomas E. Anderson,et al. Ingress Pipeline Queues Packet Buffer DMA PipelineDMA Egress Pipeline , 2015 .
[76] Thomas F. Wenisch,et al. Software Data Planes: You Can't Always Spin to Win , 2019, SoCC.
[77] David G. Andersen,et al. FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs , 2016, OSDI.
[78] Dino Farinacci,et al. Generic Routing Encapsulation (GRE) , 2000, RFC.
[79] Michio Honda,et al. PASTE: A Network Programming Interface for Non-Volatile Main Memory , 2018, NSDI.
[80] Urs Hölzle,et al. The Case for Energy-Proportional Computing , 2007 .
[81] Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1 , 2006 .