Latency-Aware Dynamic Voltage and Frequency Scaling on Many-Core Architectures for Data-Intensive Applications

Low power is a first-class design requirement for HPC systems. Dynamic voltage and frequency scaling (DVFS) has become a commonly used and efficient technique for trading off power consumption against system performance. However, most prior work using DVFS did not take into account the latency of voltage/frequency scaling, which is a critical factor on real hardware in determining the power efficiency of a power management algorithm. In this paper, we first investigate the latency characteristics of DVFS on a real many-core hardware platform. Second, we propose a latency-aware DVFS algorithm for profile-based power management that avoids aggressive power state transitions. Finally, we evaluate our algorithm on the Intel SCC platform using a data-intensive benchmark, the Graph 500 benchmark. The experimental results not only show impressive potential for energy saving in data-intensive applications (up to 31% energy saving and 60% energy-delay product (EDP) reduction), but also demonstrate the efficiency of our latency-aware DVFS algorithm, which achieves 12.0% extra energy saving and 5.0% extra EDP reduction and, moreover, improves execution performance by 22.4%.
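
The idea of avoiding aggressive power state transitions can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes a profile that lists program phases with a duration and a desired voltage/frequency level, and it skips any transition whose phase is too short relative to the scaling latency. The names `Phase`, `latency_aware_schedule`, `count_transitions`, and the `min_ratio` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phase:
    duration_ms: float   # profiled length of the phase (assumed input)
    target_level: int    # hypothetical index of the desired V/f level


def latency_aware_schedule(phases: List[Phase],
                           transition_latency_ms: float,
                           min_ratio: float = 2.0) -> List[int]:
    """Return the V/f level used in each phase.

    A level change is applied only if the phase lasts at least
    min_ratio * transition_latency_ms; otherwise the current level is kept.
    The threshold rule is an illustrative assumption, not the paper's policy.
    """
    if not phases:
        return []
    current = phases[0].target_level  # assume we start at the first target
    schedule = []
    for phase in phases:
        long_enough = phase.duration_ms >= min_ratio * transition_latency_ms
        if phase.target_level != current and long_enough:
            current = phase.target_level
        schedule.append(current)
    return schedule


def count_transitions(levels: List[int]) -> int:
    """Count how many times the level changes between consecutive phases."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


# Made-up profile: one very short phase asks for a low level.
profile = [Phase(100, 3), Phase(2, 0), Phase(80, 3), Phase(50, 1)]
naive = [p.target_level for p in profile]
aware = latency_aware_schedule(profile, transition_latency_ms=5.0)
print(count_transitions(naive), count_transitions(aware))
```

With this made-up profile, the naive schedule follows every requested level, while the latency-aware one ignores the 2 ms phase because switching would cost more time than the phase itself lasts.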
