Enabling fair pricing on high performance computer systems with node sharing

Co-location, where multiple jobs share compute nodes in large-scale HPC systems, has been shown to increase aggregate throughput and energy efficiency by 10 to 20%. However, system operators disallow co-location because of fair-pricing concerns: current systems lack a pricing mechanism that accounts for performance interference from co-running jobs. In the current pricing model, application execution time determines the price, so the minority of users whose jobs suffer from co-location pay unfairly high prices. This paper presents POPPA, a runtime system that enables fair pricing by delivering precise online interference detection, and thereby facilitates the adoption of co-location on supercomputers. POPPA leverages a novel shutter mechanism (a cyclic, fine-grained interference sampling mechanism that accurately deduces the interference between co-runners) to provide unbiased pricing of jobs that share nodes. POPPA quantifies inter-application interference within 4% mean absolute error on a variety of co-located benchmark and real scientific workloads.
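
To make the idea of interference-aware pricing concrete, the following is a minimal sketch and not POPPA's actual implementation. It assumes that each sampling cycle yields two instructions-per-cycle (IPC) readings for a job: one taken while its co-runners are active and one taken during a brief window in which they are paused. The names `ShutterSample`, `estimate_slowdown`, and `fair_price`, the use of IPC as the metric, and the linear discount are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ShutterSample:
    """One hypothetical sampling cycle for a single job."""
    ipc_corun: float  # IPC observed while co-runners are active
    ipc_alone: float  # IPC observed while co-runners are briefly paused


def estimate_slowdown(samples: Sequence[ShutterSample]) -> float:
    """Estimate the fraction of run time attributable to interference.

    Returns 0.0 when the job runs no slower with co-runners than alone.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    mean_corun = sum(s.ipc_corun for s in samples) / len(samples)
    mean_alone = sum(s.ipc_alone for s in samples) / len(samples)
    if mean_alone <= 0:
        raise ValueError("solo IPC must be positive")
    # For a fixed instruction count, time scales as 1/IPC, so the
    # interference-free time is (mean_corun / mean_alone) of the observed time.
    return max(0.0, 1.0 - mean_corun / mean_alone)


def fair_price(measured_seconds: float, rate_per_second: float,
               slowdown: float) -> float:
    """Charge for the estimated interference-free time, not the observed time."""
    estimated_solo_seconds = measured_seconds * (1.0 - slowdown)
    return estimated_solo_seconds * rate_per_second


if __name__ == "__main__":
    samples = [ShutterSample(ipc_corun=1.5, ipc_alone=2.0)]
    slowdown = estimate_slowdown(samples)
    print(slowdown, fair_price(100.0, 2.0, slowdown))
```

Under these assumptions, a job that loses a quarter of its throughput to co-runners is billed for three quarters of its observed execution time, so the cost of interference is not passed on to the user who suffered it.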
