StackMap: Low-Latency Networking with the OS Stack and Dedicated NICs

StackMap brings the best aspects of kernel-bypass networking to a new low-latency Linux network service built on the full-featured kernel TCP implementation. It dedicates network interfaces to applications and offers an extended version of the netmap API as a zero-copy, low-overhead data path, while retaining the socket API for the control path. For small-message, transactional workloads, StackMap outperforms baseline Linux by 4 to 80% in latency and 4 to 391% in throughput. It also achieves performance comparable to Seastar, a highly optimized user-level TCP/IP stack for DPDK.
