Frequent Pattern Compression: A Significance-Based Compression Scheme for L2 Caches

With the widening gap between processor and memory speeds, memory system designers may find cache compression beneficial to increase cache capacity and reduce off-chip bandwidth. Most hardware compression algorithms fall into the dictionary-based category, which depends on building a dictionary and using its entries to encode repeated data values. Such algorithms are effective in compressing large data blocks and files. Cache lines, however, are typically short (32-256 bytes), and a per-line dictionary imposes a significant overhead that limits the compressibility and increases the decompression latency of such algorithms. For such short lines, significance-based compression is an appealing alternative. We propose and evaluate a simple significance-based compression scheme that has low compression and decompression overhead. This scheme, Frequent Pattern Compression (FPC), compresses individual cache lines on a word-by-word basis by storing common word patterns in a compressed format accompanied by an appropriate prefix. For a 64-byte cache line, compression can be completed in three cycles and decompression in five cycles, assuming 12 FO4 gate delays per cycle. We propose a compressed cache design in which data is stored in compressed form in the L2 caches but is uncompressed in the L1 caches. To reduce decompression overhead, L2 cache lines are compressed to predetermined sizes that never exceed their original size. This simple scheme provides compression ratios comparable to those of more complex schemes that have higher cache hit latencies.
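The word-by-word, prefix-plus-pattern idea can be illustrated with a minimal Python sketch. The abstract does not list the patterns, the prefix width, or the predetermined line sizes, so everything concrete here is an assumption chosen for illustration: the 3-bit prefix, the particular pattern set in `classify_word`, the 8-byte `segment` granularity, and the function names themselves. Zero words are also simplified to cost only a prefix. The sketch estimates sizes only; it does not emit or decode a bitstream.

```python
PREFIX_BITS = 3  # assumed prefix width; not stated in the abstract


def fits_signed(word, bits):
    """True if the 32-bit word, read as two's complement, fits in `bits` bits."""
    signed = word - (1 << 32) if word & 0x80000000 else word
    return -(1 << (bits - 1)) <= signed < (1 << (bits - 1))


def classify_word(word):
    """Return (pattern_name, data_bits) for one 32-bit word.

    The pattern set is illustrative, not the paper's definitive table.
    """
    word &= 0xFFFFFFFF
    if word == 0:
        return ("zero", 0)
    if fits_signed(word, 4):
        return ("sign-extended 4-bit", 4)
    if fits_signed(word, 8):
        return ("sign-extended byte", 8)
    if fits_signed(word, 16):
        return ("sign-extended halfword", 16)
    if word & 0xFFFF == 0:
        return ("halfword padded with zeros", 16)
    if word == (word & 0xFF) * 0x01010101:
        return ("repeated byte", 8)
    return ("uncompressed", 32)


def compressed_bits(line):
    """Total prefix + data bits for a cache line given as bytes (little-endian words)."""
    words = [int.from_bytes(line[i:i + 4], "little") for i in range(0, len(line), 4)]
    return sum(PREFIX_BITS + classify_word(w)[1] for w in words)


def stored_bytes(line, segment=8):
    """Round the compressed size up to a whole number of segments (assumed 8 bytes),
    capped at the original line size so a line never grows."""
    nbytes = (compressed_bits(line) + 7) // 8
    rounded = ((nbytes + segment - 1) // segment) * segment
    return min(len(line), rounded)


if __name__ == "__main__":
    zero_line = bytes(64)
    print(compressed_bits(zero_line), stored_bytes(zero_line))
```

Under these assumptions an incompressible word costs 35 bits rather than 32, which is why the cap in `stored_bytes` matters: a line that would expand is simply kept at its original size.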
