The D.A.V.I.D.E. big-data-powered fine-grain power and performance monitoring support

In the race toward exascale, supercomputing systems face important challenges that limit their efficiency. Among these, power and energy consumption, fueled by the end of Dennard scaling, are starting to limit supercomputers' peak performance and cost effectiveness. In this paper we present a new methodology based on a set of HW and SW extensions for fine-grain power monitoring and for aggregating the measurements for fast analysis and visualization. We propose a turn-key system that uses an MQTT communication layer, a NoSQL database, fine-grain monitoring and, in the future, AI technology to measure and control power and performance. This methodology is shown as an integrated feature of the D.A.V.I.D.E. supercomputing machine.
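As an illustration of the aggregation step mentioned above, the following is a minimal sketch, not the D.A.V.I.D.E. implementation. It assumes power samples arrive tagged with an MQTT-style topic string and a timestamp, and reduces them to per-window statistics suitable for storage and plotting. The `PowerSample` type, the `aggregate` function, the topic layout and the one-second window are all hypothetical choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerSample:
    topic: str        # hypothetical MQTT-style topic, e.g. "davide/node01/power"
    timestamp: float  # seconds since an arbitrary epoch
    watts: float      # instantaneous power reading


def aggregate(samples, window_s=1.0):
    """Group samples by (topic, window start) and summarize each window."""
    buckets = defaultdict(list)
    for s in samples:
        # Align each sample to the start of its fixed-size time window.
        window_start = int(s.timestamp // window_s) * window_s
        buckets[(s.topic, window_start)].append(s.watts)
    return {
        key: {"mean_w": sum(v) / len(v), "max_w": max(v), "count": len(v)}
        for key, v in sorted(buckets.items())
    }


if __name__ == "__main__":
    demo = [
        PowerSample("davide/node01/power", 0.1, 100.0),
        PowerSample("davide/node01/power", 0.6, 200.0),
        PowerSample("davide/node01/power", 1.2, 300.0),
    ]
    for key, stats in aggregate(demo).items():
        print(key, stats)
```

Keying each window by topic and window start keeps the summaries independent per sensor, so they could be written as individual records to a key-value or document store. The choice of store and schema is left open here.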
