Channel reservation protocol for over-subscribed channels and destinations

Channels in system-wide networks tend to be over-subscribed because bandwidth is costly and traffic demands keep increasing. To make matters worse, workloads can overstress specific destinations, creating hotspots. Lossless networks offer attractive advantages over lossy networks but suffer from tree saturation, which led to the development of explicit congestion notification (ECN). However, ECN is very sensitive to its configuration parameters and acts only after congestion has formed. We propose the channel reservation protocol (CRP), which lets sources reserve bandwidth in multiple resources with a single request, in advance of packet transmission, without idling resources as circuit switching does. Because CRP prevents congestion from occurring in the first place, it reacts instantly to traffic changes, whereas ECN requires 300,000 cycles to stabilize in our experiments. Furthermore, ECN may not prevent congestion formed by short-lived flows generated by a large number of different source-destination pairs.
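The idea of reserving several resources with one request can be illustrated with a minimal sketch. It is not the CRP implementation: the `Channel` class, the `reserve` function, the interval bookkeeping, and the first-fit search are all assumptions made for illustration, and per-hop latency is ignored so that every channel on the path is reserved for the same time window.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical channel holding reserved half-open intervals [start, end)."""
    name: str
    reserved: list = field(default_factory=list)

    def next_free_after(self, start, end):
        """Return `start` if [start, end) is free, else the end of the
        latest reservation that overlaps the window."""
        conflicts = [e for s, e in self.reserved if not (end <= s or start >= e)]
        return max(conflicts) if conflicts else start


def reserve(path, now, duration):
    """Reserve `duration` cycles on every channel in `path` with one request.

    Simplifying assumption: all channels are reserved for the same window,
    so the search slides the start time forward until no channel conflicts.
    """
    start = now
    moved = True
    while moved:
        moved = False
        for ch in path:
            candidate = ch.next_free_after(start, start + duration)
            if candidate != start:
                start = candidate  # strictly later, so the loop terminates
                moved = True
    for ch in path:
        ch.reserved.append((start, start + duration))
    return start


# Two sources share one destination channel (a hotspot).
src_a, src_b, dest = Channel("src_a"), Channel("src_b"), Channel("dest")
first = reserve([src_a, dest], now=0, duration=4)
second = reserve([src_b, dest], now=0, duration=4)
print(first, second)
```

In this sketch the second source is granted a window that starts only after the first one ends on the shared destination channel, so the two transfers never compete for it inside the network; the waiting happens at the source before any packet is sent.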
