Full-system chip multiprocessor power evaluations using FPGA-based emulation

Designing chip multiprocessors (CMPs) requires extremely long simulation times to explore performance, power, and thermal issues, particularly when operating system (OS) effects are included. In response, we present an FPGA-based emulation methodology that models a full CMP design, including applications and an OS. Activity counters programmed into the cores feed per-component microarchitectural power models, which achieve under 10% error relative to detailed gate-level simulations. Our method retains software flexibility while offering up to a 35x speedup over corresponding full-system software simulations. We demonstrate the approach by emulating a 2-core, cache-coherent Leon3 multiprocessor running Linux and parallel benchmarks. In an example case study, the emulated system uses activity counts as a proxy for temperature to guide process migration between the CMP cores. Overall, the methodology enables detailed power and thermal studies of CMPs and their operating systems.
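To make the counter-driven flow concrete, here is a minimal sketch of how per-component activity counts could feed a power estimate and a migration decision. It is not the paper's implementation: the linear "idle power plus energy per event" model, the component names, the coefficient values, and the helper names (`ComponentModel`, `core_power_mw`, `pick_migration`) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ComponentModel:
    """Hypothetical linear power model for one microarchitectural component."""
    idle_power_mw: float        # assumed constant power when the component is idle
    energy_per_event_nj: float  # assumed energy charged per counted event


def component_power_mw(model, events, interval_s):
    """Average power over one sampling interval, in milliwatts."""
    # nJ per event * events / seconds gives nW; multiply by 1e-6 to get mW.
    dynamic_mw = model.energy_per_event_nj * events / interval_s * 1e-6
    return model.idle_power_mw + dynamic_mw


def core_power_mw(models, counters, interval_s):
    """Sum the per-component estimates for one core."""
    return sum(
        component_power_mw(models[name], counters.get(name, 0), interval_s)
        for name in models
    )


def pick_migration(core_activity, threshold):
    """Return (source_core, target_core) if the activity gap exceeds threshold.

    Activity counts stand in for temperature, as in the case study; the
    threshold policy itself is an assumption of this sketch.
    """
    hottest = max(core_activity, key=core_activity.get)
    coolest = min(core_activity, key=core_activity.get)
    if core_activity[hottest] - core_activity[coolest] > threshold:
        return hottest, coolest
    return None


if __name__ == "__main__":
    # Illustrative coefficients only; real values would come from calibration
    # against gate-level simulation.
    models = {
        "icache": ComponentModel(idle_power_mw=10.0, energy_per_event_nj=0.5),
        "alu": ComponentModel(idle_power_mw=5.0, energy_per_event_nj=0.2),
    }
    counters = {"icache": 1_000_000, "alu": 2_000_000}
    print(core_power_mw(models, counters, interval_s=0.01))
    print(pick_migration({0: 900, 1: 300}, threshold=500))
```

A linear model keeps the on-chip hardware down to counters alone, with the arithmetic done in software, which fits the flexibility the abstract emphasizes. Any real deployment would need the coefficients fitted per component.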
