Improving fairness, throughput and energy-efficiency on a chip multiprocessor through DVFS

Recently, a single chip multiprocessor (CMP) is becoming an attractive architecture for improving throughput of program execution. In CMPs, multiple processor cores share several hardware resources such as cache memory and memory bus. Therefore, the resource contention significantly degrades performance of each thread and also loses fairness between threads. In this paper, we propose a Dynamic Frequency and Voltage Scaling (DVFS) algorithm for improving total instruction throughput, fairness, and energy efficiency of CMPs. The proposed technique periodically observes the utilization ratio of shared resources and controls the frequency and the voltage of each processor core individually to balance the ratio between threads. We evaluate our technique and the evaluation results show that fairness between threads are greatly improved by the technique. Moreover, the total instruction throughput increases in many cases while reducing energy consumption.

[1]  William J. Dally,et al.  Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[2]  Yan Solihin,et al.  Predicting inter-thread cache contention on a chip multi-processor architecture , 2005, 11th International Symposium on High-Performance Computer Architecture.

[3]  Dean M. Tullsen,et al.  Symbiotic jobscheduling for a simultaneous mutlithreading processor , 2000, SIGP.

[4]  Diana Marculescu,et al.  Power and performance evaluation of globally asynchronous locally synchronous processors , 2002, ISCA.

[5]  Margaret Martonosi,et al.  Wattch: a framework for architectural-level power analysis and optimizations , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[6]  Dirk Grunwald,et al.  Methods for modeling resource contention on simultaneous multithreading processors , 2005, 2005 International Conference on Computer Design.

[7]  G. Edward Suh,et al.  A new memory monitoring scheme for memory-aware scheduling and partitioning , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[8]  M TullsenDean,et al.  Symbiotic jobscheduling for a simultaneous mutlithreading processor , 2000 .

[9]  Yen-Kuang Chen,et al.  The energy efficiency of CMP vs. SMT for multimedia workloads , 2004, ICS '04.

[10]  Michael L. Scott,et al.  Profile-based dynamic voltage and frequency scaling for a multiple clock domain microprocessor , 2003, ISCA '03.

[11]  K. Nose,et al.  Deterministic inter-core synchronization with periodically all-in-phase clocking for low-power multi-core SoCs , 2005, ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005..

[12]  Manoj Franklin,et al.  Balancing thoughput and fairness in SMT processors , 2001, 2001 IEEE International Symposium on Performance Analysis of Systems and Software. ISPASS..

[13]  Michael L. Scott,et al.  Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[14]  S. Kim,et al.  Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..

[15]  Todd M. Austin,et al.  SimpleScalar: An Infrastructure for Computer System Modeling , 2002, Computer.

[16]  Kevin Skadron,et al.  Performance, energy, and thermal considerations for SMT and CMP architectures , 2005, 11th International Symposium on High-Performance Computer Architecture.

[17]  Mahmut T. Kandemir,et al.  Exploiting barriers to optimize power consumption of CMPs , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[18]  Daniel Marcos Chapiro,et al.  Globally-asynchronous locally-synchronous systems , 1985 .

[19]  Sandhya Dwarkadas,et al.  Dynamic frequency and voltage control for a multiple clock domain microarchitecture , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..