Large Data Transfer Predictability and Forecasting using Application-Aware SDN

Network management for applications that rely on large-scale data transfers is challenging because access traffic patterns are volatile and dynamic. Predictive analytics and forecasting play an important role in allocating resources effectively for large data transfers. We propose a predictive analytics solution for large data transfers using an application-aware software-defined networking (SDN) approach. We perform extensive exploratory data analysis to characterize a dataset of GridFTP connection transfers and present various strategies for using it with statistical forecasting models. We develop a univariate autoregressive integrated moving average (ARIMA) based prediction framework for forecasting GridFTP connection transfers. Our prediction model integrates tightly with an application-aware SDN solution to preemptively drive network management decisions for GridFTP resource allocation at a U.S. CMS Tier-2 site. When applied to rolling forecasts, our framework achieves a mean absolute percentage error (MAPE) ranging from 6% to 10%.
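
To make the evaluation terms concrete, the following is a minimal sketch of a rolling one-step-ahead forecast scored with MAPE. It is not the framework described above: the abstract does not give the ARIMA order or the fitting procedure, so the sketch assumes a deliberately simple ARIMA(1,1,0) model without a constant, fitted by least squares on the differenced series. The function names and the sample values are hypothetical and are not taken from the GridFTP dataset.

```python
from typing import List, Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error in percent. Assumes no actual value is zero."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def forecast_arima_110(history: Sequence[float]) -> float:
    """One-step forecast from an assumed ARIMA(1,1,0) model without a constant."""
    # First-order differencing (the "I" part, d = 1).
    diffs = [b - a for a, b in zip(history, history[1:])]
    if len(diffs) < 2:
        # Too little data to estimate the AR coefficient: repeat the last value.
        return history[-1]
    # Least-squares estimate of the AR(1) coefficient on the differences.
    num = sum(x * y for x, y in zip(diffs[:-1], diffs[1:]))
    den = sum(x * x for x in diffs[:-1])
    phi = num / den if den else 0.0
    # Forecast the next difference, then undo the differencing.
    return history[-1] + phi * diffs[-1]


def rolling_forecast(series: Sequence[float], n_train: int) -> List[float]:
    """Refit on all data seen so far, forecast one step, then reveal the actual value."""
    history = list(series[:n_train])
    predictions = []
    for actual in series[n_train:]:
        predictions.append(forecast_arima_110(history))
        history.append(actual)
    return predictions


if __name__ == "__main__":
    # Hypothetical per-interval transfer counts, for illustration only.
    counts = [120.0, 132.0, 128.0, 141.0, 150.0, 147.0, 158.0, 163.0]
    preds = rolling_forecast(counts, n_train=5)
    print(f"MAPE: {mape(counts[5:], preds):.2f}%")
```

In practice a library implementation of ARIMA with order selection would replace `forecast_arima_110`; the rolling loop and the MAPE computation stay the same.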
