ITCA: Inter-task Conflict-Aware CPU Accounting for CMPs

Chip multiprocessor (CMP) architectures are becoming increasingly popular as an alternative to traditional processors that extract only instruction-level parallelism from an application. CMPs complicate CPU utilization accounting because the progress an application makes during an interval of time depends heavily on the activity of the other applications it is co-scheduled with. In this paper, we identify how inaccurate measurement of CPU utilization affects several key aspects of the system, such as application scheduling and the charging mechanism in data centers. We propose a new hardware CPU accounting mechanism that improves the accuracy of CPU utilization measurement in CMPs, and we compare it with previous accounting mechanisms. Our results show that currently known mechanisms lead to a 19% average error in CPU utilization accounting. Our proposal reduces this error to less than 1% in a modeled 4-core processor system.
