Transparent and Flexible Network Management for Big Data Processing in the Cloud

We introduce FlowComb, a network management framework that helps Big Data processing applications, such as Hadoop, achieve high utilization and low data processing times. Using software agents installed on application servers, FlowComb predicts application network transfers, sometimes before they start, while remaining completely transparent to the application. A centralized decision engine collects data movement information from the agents and schedules upcoming flows on paths so that the network does not become congested. Results on our lab testbed show that FlowComb reduces the time to sort 10 GB of randomly generated data by 35% while changing paths for only 6% of the transfers.
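
To make the decision engine's role concrete, the following is a minimal sketch of one way a central scheduler could place predicted flows on paths to avoid congestion. It is not FlowComb's actual algorithm, which the abstract does not specify. The function name `schedule_flows`, the flow tuple layout, and the greedy least-bottleneck heuristic are all assumptions made for illustration.

```python
from collections import defaultdict


def schedule_flows(flows, candidate_paths, link_capacity):
    """Greedy placement sketch (hypothetical, not the paper's method).

    flows: list of (flow_id, src, dst, rate) tuples predicted by agents.
    candidate_paths: dict mapping (src, dst) to a list of paths,
        where each path is a tuple of link identifiers.
    link_capacity: dict mapping link identifier to capacity
        (same unit as rate).
    Returns (assignment, load): the chosen path per flow and the
    resulting load per link.
    """
    load = defaultdict(float)  # link -> total rate assigned so far
    assignment = {}

    # Place the largest flows first; they are the hardest to fit.
    for flow_id, src, dst, rate in sorted(flows, key=lambda f: -f[3]):

        def bottleneck(path, rate=rate):
            # Utilization of the most loaded link if this flow used the path.
            return max((load[link] + rate) / link_capacity[link] for link in path)

        best = min(candidate_paths[(src, dst)], key=bottleneck)
        for link in best:
            load[link] += rate
        assignment[flow_id] = best

    return assignment, dict(load)
```

With two equal-capacity paths between a pair of hosts and two predicted flows that together exceed one path's capacity, this heuristic places the flows on different paths, so no link is oversubscribed.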
