Adapting Application Mapping to Systematic Within-Die Process Variations on Chip Multiprocessors

Process variations, which lead to timing and power variations across identically-designed components, have been identified as one of the key future design challenges by the semiconductor industry. Using worst case latency/power assumptions is one option to address process variations. This option, while simplifying the problem, is becoming less and less attractive as its performance and power costs keep increasing. As a result, exploring options that allow the software to have knowledge about the actual latency/power consumption values is critical for future systems. Targeting systematic process variations, this paper makes two contributions. First, we discuss how we can assign threads to the cores of a chip multiprocessor (CMP) with process variations in mind and show the energy-delay product (EDP) benefits such a process variation-aware thread mapping can bring. Second, we study the benefits of varying the frequencies on a subset of the cores to increase EDP savings. We propose and evaluate integer linear programming based thread mapping schemes in both studies. While these schemes operate with profile data, they can be made to work with partial profiling as well with the help of curve fitting. We tested our schemes using both sequential and multi-threaded benchmarks from different suites and the results collected indicate that we can achieve EDP savings as much as 73.4%, with an average saving of 37.1% over a process variation agnostic scheme.

[1]  Mark Horowitz,et al.  Energy dissipation in general purpose microprocessors , 1996, IEEE J. Solid State Circuits.

[2]  Peter K. Szwed,et al.  Application-level checkpointing for shared memory programs , 2004, ASPLOS XI.

[3]  Josep Torrellas,et al.  Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors , 2008, 2008 International Symposium on Computer Architecture.

[4]  Kaushik Roy,et al.  Variation Resilient Low-Power Circuit Design Methodology using On-Chip Phase Locked Loop , 2007, 2007 44th ACM/IEEE Design Automation Conference.

[5]  Margaret Martonosi,et al.  Wattch: a framework for architectural-level power analysis and optimizations , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[6]  Margaret Martonosi,et al.  Power Efficiency for Variation-Tolerant Multicore Processors , 2006, ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design.

[7]  David M. Brooks,et al.  Mitigating the Impact of Process Variations on Processor Register Files and Execution Units , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[8]  Konrad K. Lai,et al.  The Impact of Performance Asymmetry in Emerging Multicore Architectures , 2005, ISCA 2005.

[9]  Shekhar Y. Borkar,et al.  Designing reliable systems from unreliable components: the challenges of transistor variability and degradation , 2005, IEEE Micro.

[10]  Michael Philippsen,et al.  Near Overhead-free Heterogeneous Thread-migration , 2005, 2005 IEEE International Conference on Cluster Computing.

[11]  Patrick Crowley,et al.  Dynamic thread assignment on heterogeneous multiprocessor architectures , 2006, CF '06.

[12]  Milos D. Ercegovac,et al.  The Art of Deception: Adaptive Precision Reduction for Area Efficient Physics Acceleration , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[13]  Munkang Choi Modeling of Deterministic Within-Die Variation in Timing Analysis, Leakage current Analysis, and Delay Fault Diagnosis , 2007 .

[14]  Engin Ipek,et al.  Core fusion: accommodating software diversity in chip multiprocessors , 2007, ISCA '07.

[15]  Norman P. Jouppi,et al.  Core architecture optimization for heterogeneous chip multiprocessors , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[16]  Norman P. Jouppi,et al.  Single-ISA heterogeneous multi-core architectures for multithreaded workload performance , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[17]  Eric S. Fetzer Using Adaptive Circuits to Mitigate Process Variations in a Microprocessor Design , 2006, IEEE Design & Test of Computers.

[18]  Dejan S. Milojicic,et al.  Process migration , 1999, ACM Comput. Surv..

[19]  Costas J. Spanos,et al.  Optimum sampling for characterization of systematic variation in photolithography , 2002, SPIE Advanced Lithography.

[20]  Shekhar Y. Borkar VLSI Design Challenges for Gigascale Integration , 2005, VLSI Design.

[21]  S. Winkel Optimal versus Heuristic Global Code Scheduling , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[22]  John K. Bennett,et al.  Raptor: integrating checkpoints and thread migration for cluster management , 2003, 22nd International Symposium on Reliable Distributed Systems, 2003. Proceedings..

[23]  James Tschanz,et al.  Impact of Parameter Variations on Circuits and Microarchitecture , 2006, IEEE Micro.

[24]  Kevin Skadron,et al.  Impact of process variations on multicore performance symmetry , 2007 .

[25]  Andrew B. Kahng,et al.  Lens Aberration Aware Timing-Driven Placement , 2006, Proceedings of the Design Automation & Test in Europe Conference.

[26]  Kaushik Roy,et al.  Process Variation Tolerant Low Power DCT Architecture , 2007, 2007 Design, Automation & Test in Europe Conference & Exhibition.

[27]  Christian Prins,et al.  Applications of optimisation with Xpress-MP , 2002 .

[28]  Ke Meng,et al.  Process Variation Aware Cache Leakage Management , 2006, ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design.