TOWARDS AN AUTONOMOUS FRAMEWORK FOR HPC OPTIMIZATION: A STUDY OF PERFORMANCE PREDICTION USING HARDWARE COUNTERS AND MACHINE LEARNING

As high-performance computing becomes ever more critical to scientific research across many fields, increasing performance without raising energy consumption becomes essential to ensuring the financial viability of exascale systems. This work presents the first step towards understanding, through a machine learning model, how the computational requirements of benchmark applications relate to overall runtime, and how that relationship can be used to develop an autonomous framework capable of scaling applications to an optimal trade-off between performance and energy consumption.
