ATP: a Datacenter Approximate Transmission Protocol

Many datacenter applications such as machine learning and streaming systems do not need the complete set of data to perform their computation. Current approximate applications in datacenters run on a reliable network layer like TCP. To improve performance, they either let sender select a subset of data and transmit them to the receiver or transmit all the data and let receiver drop some of them. These approaches are network oblivious and unnecessarily transmit more data, affecting both application runtime and network bandwidth usage. On the other hand, running approximate application on a lossy network with UDP cannot guarantee the accuracy of application computation. We propose to run approximate applications on a lossy network and to allow packet loss in a controlled manner. Specifically, we designed a new network protocol called Approximate Transmission Protocol, or ATP, for datacenter approximate applications. ATP opportunistically exploits available network bandwidth as much as possible, while performing a loss-based rate control algorithm to avoid bandwidth waste and re-transmission. It also ensures bandwidth fair sharing across flows and improves accurate applications' performance by leaving more switch buffer space to accurate flows. We evaluated ATP with both simulation and real implementation using two macro-benchmarks and two real applications, Apache Kafka and Flink. Our evaluation results show that ATP reduces application runtime by 13.9% to 74.6% compared to a TCP-based solution that drops packets at sender, and it improves accuracy by up to 94.0% compared to UDP.

[1]  Henri Casanova,et al.  High-Bandwidth Low-Latency Approximate Interconnection Networks , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[2]  Mushtaq Ahmed,et al.  Contention-Based Congestion Control in Wireless Ad Hoc Networks , 2011 .

[3]  Chris Jermaine,et al.  Online aggregation for large MapReduce jobs , 2011, Proc. VLDB Endow..

[4]  Nick McKeown,et al.  pFabric: minimal near-optimal datacenter transport , 2013, SIGCOMM.

[5]  Devavrat Shah,et al.  Fastpass , 2014, SIGCOMM.

[6]  Srinivasan Seshan,et al.  FCP: a flexible transport framework for accommodating diversity , 2013, SIGCOMM.

[7]  Thu D. Nguyen,et al.  ApproxHadoop: Bringing Approximations to MapReduce Frameworks , 2015, ASPLOS.

[8]  Haitao Wu,et al.  Enabling ECN in Multi-Service Multi-Queue Data Centers , 2016, NSDI.

[9]  QUTdN QeO,et al.  Random early detection gateways for congestion avoidance , 1993, TNET.

[10]  T. N. Vijaykumar,et al.  Deadline-aware datacenter tcp (D2TCP) , 2012, CCRV.

[11]  Hong Zhang,et al.  Resilient Datacenter Load Balancing in the Wild , 2017, SIGCOMM.

[12]  George Varghese,et al.  CONGA: distributed congestion-aware load balancing for datacenters , 2015, SIGCOMM.

[13]  Luis Ceze,et al.  SAP: an Architecture for Selectively Approximate Wireless Communication , 2015, ArXiv.

[14]  Scott Shenker,et al.  Recursively Cautious Congestion Control , 2014, NSDI.

[15]  Sylvia Ratnasamy,et al.  SoftNIC: A Software NIC to Augment Hardware , 2015 .

[16]  David A. Maltz,et al.  Data center TCP (DCTCP) , 2010, SIGCOMM 2010.

[17]  Helen J. Wang,et al.  Online aggregation , 1997, SIGMOD '97.

[18]  Aviral Shrivastava,et al.  Mitigating soft error failures for multimedia applications by selective data protection , 2006, CASES '06.

[19]  Lei Shi,et al.  Dcell: a scalable and fault-tolerant network structure for data centers , 2008, SIGCOMM '08.

[20]  Costin Raiciu,et al.  Opening Up Black Box Networks with CloudTalk , 2012, HotCloud.

[21]  Gautam Kumar,et al.  pHost: distributed near-optimal datacenter transport over commodity network fabric , 2015, CoNEXT.

[22]  Van Jacobson,et al.  BBR: Congestion-Based Congestion Control , 2016, ACM Queue.

[23]  Chundong Wang,et al.  ASAC: automatic sensitivity analysis for approximate computing , 2014, LCTES '14.

[24]  Amin Vahdat,et al.  A scalable, commodity data center network architecture , 2008, SIGCOMM '08.

[25]  Ramana Rao Kompella,et al.  On the impact of packet spraying in data center networks , 2013, 2013 Proceedings IEEE INFOCOM.

[26]  Haitao Wu,et al.  Enabling ECN over Generic Packet Scheduling , 2016, CoNEXT.

[27]  Jacob Nelson,et al.  Approximate storage in solid-state memories , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[28]  Kathryn S. McKinley,et al.  Uncertain: a first-order type for uncertain data , 2014, ASPLOS.

[29]  Surajit Chaudhuri,et al.  Optimized stratified sampling for approximate query processing , 2007, TODS.

[30]  Li Chen,et al.  PIAS: Practical Information-Agnostic Flow Scheduling for Data Center Networks , 2014, HotNets.

[31]  David L. Black,et al.  The Addition of Explicit Congestion Notification (ECN) to IP , 2001, RFC.

[32]  Dongsu Han,et al.  Credit-Scheduled Delay-Bounded Congestion Control for Datacenters , 2017, SIGCOMM.

[33]  Robert N. M. Watson,et al.  Queues Don't Matter When You Can JUMP Them! , 2015, NSDI.

[34]  Christof Fetzer,et al.  StreamApprox: approximate computing for stream analytics , 2017, Middleware.

[35]  Christopher Ré,et al.  Approximate lineage for probabilistic databases , 2008, Proc. VLDB Endow..

[36]  Mark Handley,et al.  How Hard Can It Be? Designing and Implementing a Deployable Multipath TCP , 2012, NSDI.

[37]  Devavrat Shah,et al.  Fastpass: a centralized "zero-queue" datacenter network , 2015, SIGCOMM 2015.

[38]  Joseph M. Hellerstein,et al.  MapReduce Online , 2010, NSDI.

[39]  Song Jiang,et al.  Workload analysis of a large-scale key-value store , 2012, SIGMETRICS '12.

[40]  Woongki Baek,et al.  Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.

[41]  Costin Raiciu,et al.  GRIN: Utilizing the Empty Half of Full Bisection Networks , 2012, HotCloud.

[42]  Dan Grossman,et al.  EnerJ: approximate data types for safe and general low-power computation , 2011, PLDI '11.

[43]  Haitao Wu,et al.  ICTCP: Incast Congestion Control for TCP in Data-Center Networks , 2013, IEEE/ACM Transactions on Networking.

[44]  Ion Stoica,et al.  BlinkDB: queries with bounded errors and bounded response times on very large data , 2012, EuroSys '13.

[45]  Nick McKeown,et al.  Deconstructing datacenter packet transport , 2012, HotNets-XI.