Optically disaggregated data centers with minimal remote memory latency: Technologies, architectures, and resource allocation [Invited]

Disaggregated rack-scale data centers have been proposed as the only promising avenue to break the barrier of the fixed CPU-to-memory proportionality caused by the main-tray direct-attached memory of conventional server-centric systems. However, memory disaggregation places stringent requirements on the network in terms of latency, energy efficiency, bandwidth, and bandwidth density. This paper identifies the requirements and key performance indicators of a network that disaggregates IT resources, and summarizes the progress and importance of optical interconnects. Crucially, it proposes a rack-and-cluster scale architecture that supports the disaggregation of CPU, memory, storage, and/or accelerator blocks. Optical circuit switching forms the core of this architecture, whereas the end-points (IT resources) are equipped with on-chip programmable hybrid electrical packet/circuit switches. The architecture offers a dynamically reconfigurable physical topology from which virtual topologies are formed, each embedded with a set of functions. The paper analyzes the latency overhead of disaggregated DDR4 (parallel) and the proposed hybrid memory cube (serial) memory elements on both the conventional and the proposed architecture. A set of resource allocation algorithms is introduced to (1) optimally select disaggregated IT resources with the lowest possible latency, (2) pool them together by means of a virtual network interconnect, and (3) compose virtual disaggregated servers. Simulation findings show up to a 34% increase in resource utilization over traditional data centers while highlighting the importance of placement and locality among compute, memory, and storage resources. In particular, the network-aware locality-based resource allocation algorithm achieves memory transaction round-trip latencies as low as 15 ns, 95 ns, and 315 ns on 63%, 22%, and 15% of the allocated virtual machines (VMs), respectively, while utilizing 100% of the CPU resources.
Furthermore, a formulation to parameterize and evaluate the additional financial costs incurred by disaggregation is reported. It is shown that the more diverse the VM requests are, the higher the net financial gain is. Finally, an experiment using silicon photonic midboard optics and an optical circuit switch demonstrates forward-error-correction-free 10⁻¹² bit error rate performance on up to five-tier scale-out networks.
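
As a rough illustration of what the reported latency tiers imply in aggregate, the short sketch below computes the VM-weighted mean of the three round-trip latencies quoted in the abstract. The figures are taken from the abstract; the variable names are illustrative, and because the abstract reports these values as lower bounds ("as low as"), the result should be read as an optimistic estimate rather than a measured average.

```python
# Latency tiers quoted in the abstract:
# (memory round-trip latency in ns, percentage of allocated VMs in that tier)
tiers = [(15, 63), (95, 22), (315, 15)]

# The three tiers should cover all allocated VMs.
total_pct = sum(pct for _, pct in tiers)

# VM-weighted mean of the best-case round-trip latencies.
weighted_mean_ns = sum(latency * pct for latency, pct in tiers) / total_pct

print(f"VM coverage: {total_pct}%")
print(f"Weighted mean round-trip latency: {weighted_mean_ns:.1f} ns")  # 77.6 ns
```

Under these assumptions the mean is dominated by the 15% of VMs in the 315 ns tier, which is consistent with the abstract's emphasis on placement and locality.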
