Grouping-Based Dynamic Power Management for Multi-threaded Programs in Chip-Multiprocessors

In the embedded systems field, the research focus has shifted from performance alone to the balance between performance and power consumption. Previous research has investigated methods that predict the execution behavior of programs and apply Dynamic Voltage and Frequency Scaling (DVFS) to adjust the processor frequency to the phase behavior of each program thread. However, few studies have paid attention to the overhead of DVFS itself. Generally, a DVFS transition leaves the processor core unavailable for 10 us to 650 us. Adjusting the frequency for every thread can therefore incur unanticipated overhead, especially for multi-threaded programs. The objective of this study is to consider performance, power consumption, and overhead together, and to provide a low-overhead power management scheme that adjusts the processor frequency for every group of threads instead of for every thread. The proposed approach consists of three components: phase behavior prediction, DVFS control, and workload migration. To demonstrate its effect, we implemented these components on a real Linux system and compared our approach with a system without DVFS and with a system that applies DVFS to every thread. The results show that our approach reduces power consumption by 15-40% with a 2-10% performance penalty. Moreover, it reduces processor core unavailable time by 94-97.5% relative to the system that applies DVFS to every thread.
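The overhead argument can be illustrated with a minimal sketch. It is not the implementation evaluated in this study: the thread IDs, the predicted frequencies, and the 100 us transition cost (one value chosen from the 10 us to 650 us range quoted above) are all hypothetical, and phase prediction and workload migration are reduced to a simple lookup. The sketch only counts how many frequency transitions are needed when the frequency is set per thread versus once per group of threads that share a predicted frequency.

```python
from collections import defaultdict

# Assumed cost of one DVFS transition, in microseconds (hypothetical value
# picked from the 10-650 us range stated in the abstract).
TRANSITION_US = 100

# Hypothetical predicted frequency (MHz) per thread ID, in scheduling order.
predicted_mhz = {0: 2000, 1: 1000, 2: 2000, 3: 1000,
                 4: 2000, 5: 1000, 6: 2000, 7: 1000}


def count_transitions(freqs):
    """Count frequency changes needed to run the given sequence in order."""
    changes = 0
    current = None  # frequency is treated as unset before the first thread
    for mhz in freqs:
        if mhz != current:
            changes += 1
            current = mhz
    return changes


def group_threads(predicted):
    """Group thread IDs by their predicted frequency."""
    groups = defaultdict(list)
    for tid, mhz in predicted.items():
        groups[mhz].append(tid)
    return dict(groups)


# Per-thread DVFS: the frequency is re-evaluated before every thread.
per_thread = count_transitions(predicted_mhz.values())

# Per-group DVFS: one frequency setting per group of like threads.
groups = group_threads(predicted_mhz)
per_group = len(groups)

print("per-thread transitions:", per_thread, "->", per_thread * TRANSITION_US, "us")
print("per-group transitions: ", per_group, "->", per_group * TRANSITION_US, "us")
```

With this alternating input, per-thread control needs eight transitions and per-group control needs two. The ratio depends entirely on the assumed workload and says nothing about the measured results reported above.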
