Neural Network Meets DCN: Traffic-driven Topology Adaptation with Deep Learning

Emerging optical and wireless topology reconfiguration technologies have shown great potential for improving the performance of data center networks. However, they also pose a major challenge: how to find the best topology configuration to support dynamic traffic demands. In this work, we present xWeaver, a traffic-driven deep learning solution that infers high-performance network topologies online. xWeaver supports a powerful network model that enables topology optimization over different performance metrics and network architectures. With properly structured neural networks, it automatically derives the critical traffic patterns from data traces and learns the underlying mapping between traffic patterns and topology configurations specific to the target data center. After offline training, xWeaver generates optimized (or near-optimal) topology configurations online, and can also smoothly update its model parameters for new traffic patterns. Experimental results show that xWeaver achieves significant performance gains in reducing flow completion time.
