Network bandwidth demand in datacenters is doubling every 12 to 15 months. In response, high-bandwidth network interface cards, each capable of transferring hundreds of gigabits of data per second, are making inroads into the servers of next-generation datacenters. Such unprecedented data delivery rates at server endpoints raise new challenges, as inbound network traffic placement decisions within the memory hierarchy have a direct impact on end-to-end performance. Modern server-class Intel processors leverage DDIO technology to steer all inbound network data into the last-level cache (LLC), regardless of the network traffic’s nature. This static data placement policy is suboptimal from both a performance and an energy-efficiency standpoint. In this work, we design IDIO, a framework that, unlike DDIO, dynamically decides where to place inbound network traffic within a server’s multi-level memory hierarchy. IDIO dynamically monitors system behavior and distinguishes between different traffic classes to determine and periodically re-evaluate the best placement location for each flow: LLC, mid-level (L2) cache, or DRAM. Our results show that IDIO increases a server’s maximum sustainable load by up to ~33.3% across various network functions.