Towards Comprehensive Traffic Forecasting in Cloud Computing: Design and Application

In this paper, we present our effort towards comprehensive traffic forecasting for big data applications using external, lightweight file system monitoring. Our idea is motivated by two key observations: rich traffic demand information already exists in the log and metadata files of many big data applications, and such information can be readily extracted through run-time file system monitoring. As a first step, we use Hadoop as a concrete example to explore our methodology and develop a system called HadoopWatch to predict the traffic demands of Hadoop applications. We further implement HadoopWatch in a small-scale testbed with 10 physical servers and 30 virtual machines. Our experiments over a series of MapReduce applications demonstrate that HadoopWatch can forecast traffic demand with almost 100% accuracy and ahead of the actual transfers. Furthermore, it requires no modification to the Hadoop framework and introduces little overhead to application performance. Finally, to showcase the utility of the accurate traffic prediction made by HadoopWatch, we design and implement a simple HadoopWatch-enabled network optimization module in the HadoopWatch controller. With realistic Hadoop job benchmarks, we find that even a simple algorithm can leverage the forecasting results provided by HadoopWatch to improve Hadoop job completion time by up to 14.72%.
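
The monitoring idea described above can be illustrated with a minimal Python sketch. It is not the HadoopWatch implementation: it assumes that traffic demand can be approximated by the number of bytes newly written to files in a watched directory, and it compares two directory snapshots instead of using a kernel notification mechanism. The function names (`snapshot`, `forecast_new_demand`, `demo`) and the file name `task_0001.out` are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def snapshot(directory):
    """Map each regular file name in `directory` to its current size in bytes."""
    sizes = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    return sizes


def forecast_new_demand(previous, current):
    """Return {file name: bytes} for files that appeared or grew since `previous`.

    Assumption: every newly written byte will later be fetched over the
    network, so the growth in file size is taken as the predicted volume.
    """
    demand = {}
    for name, size in current.items():
        grown = size - previous.get(name, 0)
        if grown > 0:
            demand[name] = grown
    return demand


def demo():
    """Simulate a task writing an intermediate file and forecast its volume."""
    with tempfile.TemporaryDirectory() as watched:
        before = snapshot(watched)
        # Hypothetical intermediate output file; the name is illustrative only.
        with open(os.path.join(watched, "task_0001.out"), "wb") as f:
            f.write(b"x" * 4096)
        after = snapshot(watched)
        return forecast_new_demand(before, after)


if __name__ == "__main__":
    print(demo())
```

Comparing snapshots keeps the sketch portable and dependency-free. A monitor that must react before the transfer starts would more plausibly subscribe to file system events rather than poll, at the cost of platform-specific code.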
