What to expect when you are consolidating: effective prediction models of application performance on multicores

Consolidating multiple applications with diverse and changing resource requirements is common on multicore systems, where hardware resources are abundant. As opportunities for better system usage grow, so do opportunities to degrade individual application performance through unregulated performance interference among applications sharing system resources. Can we predict a performance region within which application performance is expected to lie under different consolidations? Alternatively, can we maximize resource utilization while maintaining individual application performance targets? In this work we present a methodology that addresses both questions by constructing a queueing-theory-based tool that accurately predicts application scalability on multicores. The tool can also suggest the consolidations that maximize system resource utilization while meeting application performance targets. The methodology is based on asymptotic analysis, which quickly provides a range of performance values that the user should expect under various consolidation scenarios. When more accurate forecasting is needed, the methodology refines its predictions using approximate mean value analysis. The methodology is lightweight: it captures application resource demands through standard system monitoring, via non-intrusive low-level measurements. We evaluate our approach on an IBM Power7 system using the DaCapo and SPECjvm2008 benchmark suites. Across 900 different consolidations of application instances, our tool predicts the average iteration time of collocated applications with an average error below 9 percent. Experimental and analytical results are in excellent agreement, confirming the robustness of the proposed methodology in suggesting the best consolidations that meet given performance objectives of individual applications while maximizing system resource utilization.
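To make the two analysis steps named in the abstract concrete, the following is a minimal sketch of textbook asymptotic bounds and Schweitzer-style approximate mean value analysis for a closed queueing network. It is a single-class simplification and is not the tool described here, which models multiple application classes. The function names, the choice of queueing stations, and the demand values in the usage note are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def asymptotic_bounds(demands: List[float], think_time: float, n: int) -> Tuple[float, float]:
    """Textbook asymptotic bounds on throughput for n jobs in a closed network.

    demands[k] is the total service demand of one job at station k (seconds).
    Returns (lower, upper) bounds on system throughput (jobs per second).
    """
    d_total = sum(demands)
    d_max = max(demands)
    # Upper bound: either no queueing at all, or the bottleneck is saturated.
    upper = min(n / (d_total + think_time), 1.0 / d_max)
    # Lower bound: every job queues behind all n - 1 others at every station.
    lower = n / (n * d_total + think_time)
    return lower, upper


def approximate_mva(demands: List[float], think_time: float, n: int,
                    tol: float = 1e-10, max_iter: int = 100_000) -> Tuple[float, List[float]]:
    """Schweitzer-style approximate MVA for a single-class closed network.

    Returns (throughput, per-station residence times).
    """
    k = len(demands)
    queue = [n / k] * k  # initial guess: jobs spread evenly over stations
    throughput = 0.0
    residence = list(demands)
    for _ in range(max_iter):
        # An arriving job is assumed to see (n - 1) / n of the mean queue length.
        residence = [d * (1.0 + (n - 1) / n * q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(residence))
        new_queue = [throughput * r for r in residence]  # Little's law
        converged = max(abs(a - b) for a, b in zip(new_queue, queue)) < tol
        queue = new_queue
        if converged:
            break
    return throughput, residence
```

As a usage illustration, calling `asymptotic_bounds([0.05, 0.03, 0.02], 0.0, 4)` gives a throughput range for four jobs over three hypothetical stations, and `approximate_mva` with the same arguments returns a single estimate inside that range. This mirrors the workflow in the abstract: a quick performance region first, then a tighter prediction when one is needed.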
