Evaluation of Snoop-Energy Reduction Techniques for Chip-Multiprocessors

Chip multiprocessors (CMPs) have become an attractive micro-architectural style for both high-end and low-power systems. Although the power-performance tradeoffs differ between these systems, high power consumption can lead to devastating power densities in the former and, owing to limited battery capacity, a reduced operating time in the latter. In this paper, we focus on the energy wasted by the snoopy cache protocols that keep the L1 caches in CMPs consistent. Previous studies have focused on the energy wasted by snoop accesses to the private caches in SMP systems and found that it can be a large fraction of the total energy. We apply two techniques that were developed for SMP servers, serial snooping and Jetty, to see whether they can lead to energy savings in a CMP. We find that the techniques are not well suited to a CMP and analyze why. Serial snooping does not work well because all caches have to be searched even if none can supply the data, which is the case most of the time. Jetty does not perform well because the snoop energy saved by the filtering is offset by the energy consumed in the filters themselves.
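
The two negative results above can be illustrated with a back-of-the-envelope model. The following is a minimal sketch, not the evaluation methodology of the paper: the function names, parameters, and the numbers in the usage lines are all hypothetical, and the filter model is simplified by assuming that every snoop probes the filter once and ignoring the energy of keeping the filter up to date.

```python
def serial_snoop_lookups(num_remote_caches, supplier_index=None):
    """Number of caches probed when remote caches are snooped one at a time.

    supplier_index is the 0-based position of the first cache that can
    supply the block, or None when no cache holds it (hypothetical model).
    """
    if supplier_index is None:
        # No cache can supply the data: every cache is searched anyway.
        return num_remote_caches
    # The search stops at the first cache that supplies the data.
    return supplier_index + 1


def filter_net_saving(snoops, filter_rate, e_tag_lookup, e_filter_access):
    """Net energy saved by a snoop filter in front of one cache.

    filter_rate is the fraction of snoops the filter eliminates.
    All energy values are illustrative assumptions, not measurements.
    """
    saved = snoops * filter_rate * e_tag_lookup  # tag lookups avoided
    spent = snoops * e_filter_access             # every snoop probes the filter
    return saved - spent


# With 3 remote caches and no supplier, serial snooping probes all 3.
print(serial_snoop_lookups(3))                   # 3
# A filter that removes half of the snoops still loses energy when one
# filter access costs more than half of a tag lookup.
print(filter_net_saving(1000, 0.5, 1.0, 0.75))   # -250.0
```

Under these assumptions the filter only pays off when filter_rate exceeds e_filter_access / e_tag_lookup, which becomes harder to satisfy as the snooped caches get smaller and cheaper to probe.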
