Memory state compressors for giga-scale checkpoint/restore

We propose a method for compressing the checkpoint store in coarse-grain, giga-scale checkpoint/restore, a mechanism that can be useful for debugging, post-mortem analysis, and error recovery. The method exploits value locality in the memory data and address streams. The compressors require few resources, are easily pipelined, and can process a full cache block per processor cycle. We study two ways of applying them to post-mortem analysis: (1) alone, and (2) in series with a dictionary-based compressor. Used alone, they offer competitive compression rates in most cases. Combined with a dictionary-based compressor, they significantly reduce on-chip buffer requirements.
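To make the idea of exploiting value locality concrete, the following is a minimal software sketch of a predictor-based stream compressor. It is not the design evaluated in the paper: the stride predictor, the `None`-as-hit token encoding, the 64-bit word width, and all names (`StridePredictor`, `compress`, `decompress`, `compressed_bits`) are assumptions chosen for illustration. The sketch only shows the general principle that a correctly predicted address or value can be replaced by a single hit bit, while a mispredicted one is stored as a literal.

```python
from typing import Iterable, List, Optional

WORD_BITS = 64  # assumed word width; not taken from the paper


class StridePredictor:
    """Hypothetical predictor: next item = last item + last observed stride."""

    def __init__(self) -> None:
        self.last = 0
        self.stride = 0

    def predict(self) -> int:
        return self.last + self.stride

    def update(self, actual: int) -> None:
        self.stride = actual - self.last
        self.last = actual


def compress(stream: Iterable[int]) -> List[Optional[int]]:
    """Replace each correctly predicted item with None (a 'hit' token)."""
    pred = StridePredictor()
    tokens: List[Optional[int]] = []
    for x in stream:
        tokens.append(None if pred.predict() == x else x)
        pred.update(x)
    return tokens


def decompress(tokens: Iterable[Optional[int]]) -> List[int]:
    """Rebuild the stream by running the same predictor in lockstep."""
    pred = StridePredictor()
    out: List[int] = []
    for t in tokens:
        x = pred.predict() if t is None else t
        out.append(x)
        pred.update(x)
    return out


def compressed_bits(tokens: Iterable[Optional[int]]) -> int:
    """Cost model: 1 flag bit per token, plus a full word for each literal."""
    return sum(1 if t is None else 1 + WORD_BITS for t in tokens)


if __name__ == "__main__":
    # Addresses of consecutive 64-byte blocks, followed by a jump.
    addrs = [0x1000, 0x1040, 0x1080, 0x10C0, 0x2000]
    tokens = compress(addrs)
    assert decompress(tokens) == addrs
    print(compressed_bits(tokens), "bits vs", len(addrs) * WORD_BITS, "raw")
```

Because the decompressor runs an identical predictor, no side information beyond the hit bit is needed. A hardware version would presumably replace the Python objects with small tables and fixed-width fields, but that mapping is not described in the abstract and is left open here.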
