Amazon EC2 Spot Price Prediction Using Regression Random Forests

Amazon EC2 introduced spot instances in December 2009 to sell its spare capacity through an auction-based market mechanism. Despite its extremely low prices, the cloud spot market has low utilization. Because spot pricing is dynamic, spot instances are prone to out-of-bid failure. Bidding complexity is another reason users still hesitate to use spot instances. This work presents a Regression Random Forests (RRFs) model to predict one-week-ahead and one-day-ahead spot prices. The predictions help cloud users plan in advance when to acquire spot instances, estimate execution costs, and make bid decisions that minimize execution costs and the probability of out-of-bid failure. Simulations that forecast future spot prices from 12 months of real Amazon EC2 spot price history traces show the effectiveness of the proposed technique. A comparison of RRF-based spot price forecasts with existing non-parametric machine learning models reveals that the RRF-based forecasts are more accurate than those of the other models. We measure predictive performance using MAPE, MCPE, OOB error, and speed.
Evaluation results show that $\mathrm{MAPE} \le 10\%$ for 66 to 92 percent and $\mathrm{MCPE} \le 15\%$ for 35 to 81 percent of one-day-ahead predictions with prediction time less than one second. $\mathrm{MAPE} \le 15\%$ for 71 to 96 percent of one-week-ahead predictions.
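As an illustration of how figures such as "MAPE at or below 10% for a given share of predictions" could be computed, the following is a minimal sketch. It assumes the standard definition of mean absolute percentage error, since the abstract does not spell out the exact formula used. The function names `mape` and `share_within` and the price values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (standard definition, assumed)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def share_within(errors: Sequence[float], threshold: float) -> float:
    """Percentage of forecasts whose error is at or below the threshold."""
    if len(errors) == 0:
        raise ValueError("errors must be non-empty")
    return 100.0 * sum(e <= threshold for e in errors) / len(errors)


# Hypothetical spot prices in USD per hour (illustrative values only).
actual = [0.10, 0.20, 0.40]
predicted = [0.11, 0.18, 0.40]
window_mape = mape(actual, predicted)

# Hypothetical per-forecast MAPE values, one per prediction window.
per_window = [5.0, 12.0, 9.0, 20.0]
within_10 = share_within(per_window, 10.0)
```

Here `window_mape` is the error of a single forecast window, while `within_10` summarizes many windows in the same form as the percentages reported above.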
