Design of a predictive filter cache for energy savings in high performance processor architectures

The filter cache has been proposed as an energy-saving architectural feature. It is a small cache placed between the CPU and the instruction cache (I-cache) that supplies the instruction stream. Energy is saved because most fetches are served by this small cache rather than by the larger I-cache. Performance is lost, however, when instructions are not found in the filter cache. Most of the energy savings from the filter cache come from the temporal reuse of instructions in small loops. We examine successive fetch addresses to predict dynamically whether the next fetch address is in the filter cache. When a miss is predicted, we reduce the miss penalty by accessing the I-cache directly. Experimental results show that our next-fetch prediction reduces the performance penalty by more than 91% and is more energy efficient than a conventional filter cache. Our filter cache design achieves average I-cache energy savings of 31% with around 1% performance degradation.
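To make the trade-off concrete, the following is a minimal Python sketch of a direct-mapped filter cache with and without a hit/miss predictor. It is not the predictor evaluated in the paper: the heuristic (predict a hit when the fetch stays in the same line, otherwise repeat the outcome of the last line change), the function name `simulate`, the cache geometry, the one-cycle miss penalty, and the assumption that a tag-only check can train the predictor without delaying the fetch are all hypothetical choices made for illustration.

```python
LINE_BYTES = 16   # assumed line size in bytes
NUM_LINES = 16    # assumed number of lines (a 256-byte filter cache)


def simulate(trace, predictive):
    """Count accesses and miss penalties for a trace of fetch addresses.

    predictive=False models a conventional filter cache that is always
    probed first. predictive=True bypasses the filter cache and goes
    straight to the I-cache whenever a miss is predicted.
    """
    tags = [None] * NUM_LINES  # direct-mapped tag array of the filter cache
    stats = {"filter_accesses": 0, "icache_accesses": 0, "penalty_cycles": 0}
    prev_line = None       # line of the previous fetch
    new_line_hit = False   # hypothetical predictor state: did the last
                           # line change find its line in the filter cache?

    for addr in trace:
        line = addr // LINE_BYTES
        index = line % NUM_LINES
        present = tags[index] == line
        same_line = line == prev_line

        if not predictive or same_line:
            predict_hit = True
        else:
            predict_hit = new_line_hit

        if predict_hit:
            stats["filter_accesses"] += 1
            if not present:
                # Mispredicted hit: pay the penalty, then go to the I-cache.
                stats["penalty_cycles"] += 1
                stats["icache_accesses"] += 1
        else:
            # Predicted miss: access the I-cache directly, no penalty.
            stats["icache_accesses"] += 1

        tags[index] = line  # the fetched line is (re)filled into the filter cache
        if not same_line:
            # Assumption: a tag-only check is available to train the predictor.
            new_line_hit = present
        prev_line = line

    return stats


# A small loop (8 instructions, 2 lines) executed 100 times.
loop_trace = [pc for _ in range(100) for pc in range(0, 32, 4)]
# Straight-line code with no reuse (1000 instructions, 250 lines).
linear_trace = list(range(0, 4000, 4))
```

Under these assumptions, `simulate(linear_trace, predictive=False)` pays one penalty cycle for each of the 250 new lines, while the predictive variant sends those fetches straight to the I-cache and pays none. On `loop_trace` both variants serve almost every fetch from the filter cache, which is the small-loop reuse the abstract credits for the energy savings.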
