IDIO: Orchestrating Inbound Network Data on Server Processors

Network bandwidth demand in datacenters is doubling every 12 to 15 months. In response, high-bandwidth network interface cards, each capable of transferring hundreds of gigabits of data per second, are making inroads into the servers of next-generation datacenters. Such unprecedented data delivery rates at server endpoints raise new challenges, because decisions about where inbound network traffic is placed within the memory hierarchy directly affect end-to-end performance. Modern server-class Intel processors use Data Direct I/O (DDIO) technology to steer all inbound network data into the last-level cache (LLC), regardless of the nature of the traffic. This static data placement policy is suboptimal in terms of both performance and energy efficiency. In this work, we design IDIO, a framework that, unlike DDIO, dynamically decides where to place inbound network traffic within a server's multi-level memory hierarchy. IDIO monitors system behavior at run time and distinguishes between traffic classes to determine, and periodically re-evaluate, the best placement location for each flow: the LLC, the mid-level (L2) cache, or DRAM. Our results show that IDIO increases a server's maximum sustainable load by up to ~33.3% across various network functions.
