Low-margin optical networking at cloud scale [Invited]

Every day, customers across the globe send requests for diverse types of data to cloud service provider servers, expecting instantaneous response times and seamless availability. The physical infrastructure that underpins those services is built on optics and optical networks; this paper focuses on Microsoft’s approach to the optical network. To maintain a global optical networking infrastructure that meets these customer needs, Microsoft must use solutions that are highly tailored and optimized for the application space they address, with appropriately streamlined solutions for the metropolitan data center interconnect and long-haul portions of the network. This paper presents Microsoft’s approach to tackling these challenges at cloud scale, highlighting the low-margin solutions that are employed. We survey Microsoft’s regional network design and the corresponding optical network architectures, and present volumes of real-time polled metrics from the thousands of line systems and tens of thousands of transceivers deployed today. We close by describing our approach to a unified software-defined networking toolset, which ultimately enables the velocity and scale with which we can grow and operate this critical network infrastructure.
