vFlood: Opportunistic flooding to improve TCP transmit performance in virtualized clouds

Virtualization is a key technology that powers cloud computing platforms such as Amazon EC2. Virtual machine (VM) consolidation, where multiple VMs share a physical host, has seen rapid adoption in practice, with an increasingly large number of VMs per machine and per CPU core. Our investigations, however, suggest that this increasing degree of VM consolidation has serious negative effects on the VMs' TCP transport performance. As multiple VMs share a given CPU, scheduling latencies, which can be on the order of tens of milliseconds, substantially inflate the typically sub-millisecond round-trip times (RTTs) of TCP connections in a datacenter, causing significant throughput degradation. In this paper, we propose a lightweight solution called vFlood that (a) allows a TCP sender VM to opportunistically flood the driver domain in the same host, and (b) offloads the VM's TCP congestion control function to the driver domain, in order to mask the effects of VM consolidation. Our evaluation of a vFlood prototype on Xen suggests that vFlood substantially improves TCP transmit throughput with minimal per-packet CPU overhead. Further, our application-level evaluation using Apache Olio, a Web 2.0 cloud application, indicates a 33% improvement in the number of operations per second.
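
Concretely, the sender-side split that the abstract describes can be pictured as a producer/consumer pair around a per-flow staging buffer. The C sketch below is only an illustration of that idea under assumed names and data structures (flow_buf, guest_flood, driver_tx, and so on are hypothetical, not the actual Xen implementation): the guest floods segments into the buffer whenever it happens to be scheduled, while the driver domain, which stays responsive even when the guest is descheduled, enforces the congestion window and releases buffered segments as ACKs arrive.

```c
/* Minimal sketch of the sender-side split described above. Hypothetical
 * names and structures; vFlood's real Xen implementation differs. */
#include <stdbool.h>
#include <stddef.h>

#define BUF_SLOTS 4096            /* per-flow staging ring in the driver domain */

struct segment {
    size_t len;
    unsigned char data[1500];
};

struct flow_buf {
    struct segment slots[BUF_SLOTS];
    size_t head, tail;            /* ring indices: head = next to send */
    size_t cwnd;                  /* congestion window (segments), driver-owned */
    size_t in_flight;             /* segments sent but not yet ACKed */
};

/* Guest side: invoked for every TCP segment the VM emits. The guest is
 * flow-controlled only by buffer occupancy, not by the network path. */
bool guest_flood(struct flow_buf *f, const struct segment *s)
{
    size_t next = (f->tail + 1) % BUF_SLOTS;
    if (next == f->head)
        return false;             /* ring full: back-pressure the guest */
    f->slots[f->tail] = *s;
    f->tail = next;
    return true;
}

/* Driver-domain side: drain the ring onto the wire, never exceeding
 * the congestion window. */
static void driver_tx(struct flow_buf *f, void (*send)(const struct segment *))
{
    while (f->head != f->tail && f->in_flight < f->cwnd) {
        send(&f->slots[f->head]);
        f->head = (f->head + 1) % BUF_SLOTS;
        f->in_flight++;
    }
}

/* Driver-domain side: congestion control offloaded from the guest. The
 * additive increase here is a toy placeholder for a real algorithm
 * such as NewReno or CUBIC. */
void driver_on_ack(struct flow_buf *f, void (*send)(const struct segment *))
{
    if (f->in_flight > 0)
        f->in_flight--;
    f->cwnd++;                    /* placeholder cwnd update */
    driver_tx(f, send);           /* ACK frees window: release more segments */
}
```

The point of the split is that driver_on_ack() runs in the driver domain's context, so the ACK clock ticks at the network's sub-millisecond RTT rather than at the tens-of-milliseconds VM-scheduling latency; the guest stalls only when the staging buffer itself fills.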
