A Novel Power Management for CMP Systems in Data-Intensive Environment

Today's emerging data-intensive applications comprise non-uniform CPU-intensive and I/O-intensive workloads, so power management strategies must account for both CPU and I/O effects. Scaling down the processor's frequency based only on its busy/idle ratio cannot fully exploit the opportunities for saving power. Our experiments show that, besides busy and idle states, each processor may also have I/O wait phases in which it waits for I/O operations to complete. During these phases, the completion time is determined by the I/O subsystem rather than the CPU, so scaling the processor to a lower frequency saves more power without affecting performance. In addition, the CPU's reaction to I/O operations may be significantly affected by several factors, such as I/O type (synchronous or asynchronous) and instruction/job-level parallelism, so it cannot be accurately modeled by physical laws in the way mechanical or chemical systems can. In this paper, we propose a novel power management scheme called MAR (modeless, adaptive, rule-based) for multiprocessor systems that minimizes CPU power consumption under performance constraints. By using richer feedback factors, e.g., the I/O wait, MAR is able to accurately describe the relationships among core frequencies, performance, and power consumption. We adopt a modeless control approach to reduce the complexity of system modeling. MAR is designed for CMP (chip multiprocessor) systems and employs multi-input/multi-output (MIMO) theory and per-core DVFS (dynamic voltage and frequency scaling). Our extensive experiments on a physical test bed demonstrate that, for the SPEC benchmark and a data-intensive (TPC-C) benchmark, MAR reaches 93.6-96.2% of the efficiency of the ideal power-saving strategy calculated off-line. Compared with baseline solutions, MAR saves 22.5-32.5% more power while keeping a comparable performance loss of about 1.8-2.9%. In addition, simulation results show the efficiency of our design for various CMP configurations.
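
The I/O wait argument above can be illustrated with a small Python sketch. It is not the MAR controller, which the abstract describes only at a high level; it is a simplified, hypothetical rule that assumes only CPU-busy time stretches when the frequency is lowered, while I/O wait time is set by the I/O subsystem. The frequency table `FREQS_MHZ`, the function `pick_frequency`, the `target_util` threshold, and the `iowait_is_busy` flag are all assumptions introduced for illustration.

```python
from bisect import bisect_left

# Hypothetical per-core frequency steps in MHz, sorted ascending.
FREQS_MHZ = [800, 1200, 1600, 2000, 2400]


def pick_frequency(busy, iowait, f_cur, freqs=FREQS_MHZ,
                   target_util=0.8, iowait_is_busy=False):
    """Pick the lowest frequency expected to keep utilization <= target_util.

    busy, iowait: fractions of the last sampling window (0..1) that the core
    spent executing and waiting on I/O, measured at frequency f_cur.
    iowait_is_busy=True mimics a policy that cannot tell I/O wait apart
    from useful work and therefore counts it as load.
    """
    if not (0.0 <= busy <= 1.0 and 0.0 <= iowait <= 1.0):
        raise ValueError("busy and iowait must be fractions in [0, 1]")
    if busy + iowait > 1.0 + 1e-9:
        raise ValueError("busy + iowait cannot exceed the sampling window")

    # Only the load that scales with frequency matters for the decision.
    load = busy + iowait if iowait_is_busy else busy

    # Frequency at which the same work would occupy target_util of the window.
    needed = f_cur * load / target_util

    # Smallest available step that is >= needed, clamped to the top step.
    idx = bisect_left(freqs, needed)
    return freqs[min(idx, len(freqs) - 1)]


if __name__ == "__main__":
    # A core that is 20% busy and 60% in I/O wait at 2400 MHz.
    print(pick_frequency(0.2, 0.6, 2400))                       # I/O-aware
    print(pick_frequency(0.2, 0.6, 2400, iowait_is_busy=True))  # I/O-blind
```

In this sketch the I/O-aware call drops to the lowest step, while the I/O-blind call stays at the highest one. That gap is the kind of saving opportunity the abstract attributes to I/O wait phases; a real controller would also need the performance feedback and adaptivity that MAR provides.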
