Simplifying and implementing service level objectives for stream parallelism

Increasing attention has been given to providing service level objectives (SLOs) in stream processing applications, driven by performance and energy requirements and by the need to limit resource usage while improving system utilization. Since current and next-generation computing systems are intrinsically parallel, software must exploit this architectural parallelism. Implementing and meeting SLOs in existing applications is not a trivial task for application programmers, since the software development process requires, besides parallelism exploitation, the implementation of autonomic algorithms or strategies. This is a system-oriented programming approach that requires managing multiple knobs and sensors (e.g., the number of threads to use, the clock frequency of the cores) so that the system can self-adapt at runtime. In this work, we introduce a new and simpler way to define SLOs in the application’s source code, abstracting from the programmer all the details of the self-adaptive system implementation. The application programmer specifies which parts of the code to parallelize and the related SLOs that should be enforced. To reach this goal, source-to-source code transformation rules are implemented in our compiler, which automatically generates self-adaptive strategies to enforce the user-expressed objectives at runtime. The experiments showed promising results, with simpler, effective, and efficient SLO implementations for real-world applications.
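To make the notion of knobs and sensors concrete, the following is a minimal C++ sketch of the kind of reactive strategy a programmer would otherwise have to write by hand. It is not the code generated by the compiler described here: the names `ThroughputSlo`, `next_worker_count`, `simulated_throughput`, and `run_control_loop` are hypothetical, and the sensor is a toy model that assumes throughput scales linearly with the number of workers.

```cpp
#include <algorithm>

// Hypothetical SLO: a minimum throughput (items/s) that a stream stage should sustain.
struct ThroughputSlo {
    double target;     // desired items per second
    double tolerance;  // relative slack above the target, e.g. 0.2 = 20%
};

// One decision step: read the sensor (measured throughput) and set the knob (worker count).
// Adds a worker when the SLO is violated, removes one when clearly over-provisioned.
int next_worker_count(int workers, double measured,
                      const ThroughputSlo& slo, int max_workers) {
    const double upper = slo.target * (1.0 + slo.tolerance);
    if (measured < slo.target) return std::min(workers + 1, max_workers);
    if (measured > upper)      return std::max(workers - 1, 1);
    return workers;  // within the accepted band: keep the current configuration
}

// Toy sensor model (assumption): each worker contributes a fixed rate.
double simulated_throughput(int workers, double per_worker_rate) {
    return workers * per_worker_rate;
}

// Runs the monitor/decide/act loop for a fixed number of steps
// and returns the worker count it settles on.
int run_control_loop(int initial_workers, double per_worker_rate,
                     const ThroughputSlo& slo, int max_workers, int steps) {
    int workers = initial_workers;
    for (int i = 0; i < steps; ++i) {
        const double measured = simulated_throughput(workers, per_worker_rate);
        workers = next_worker_count(workers, measured, slo, max_workers);
    }
    return workers;
}
```

With a target of 100 items/s, a 20% tolerance, and 25 items/s per worker, the loop settles on four workers whether it starts from one worker or from eight. A real strategy would also have to handle noisy measurements and additional knobs such as core frequency, which is the complexity the proposed approach aims to hide from the programmer.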
