A Comparative Evaluation of Designs for Reliable Memory Systems

This paper addresses the design of storage systems that must operate under critical environmental conditions. Such systems require low access latency, high throughput, and large storage capacity; they must therefore be built from highly reliable components while retaining design flexibility. Commercial off-the-shelf (COTS) components have often been used for this purpose. This paper analyzes a COTS-based architecture that uses design-level techniques (such as error detection and correction codes and scrubbing) to make commercially available Dynamic Random Access Memory (DRAM) chips fault tolerant. It provides a complete and novel analysis of the engineering alternatives that arise in the design of a highly reliable memory system based on Reed-Solomon coding. A comparative analysis of methods for permanent fault detection is provided; moreover, using a Markovian characterization, different functional arrangements (determined by the code and the scrubbing frequency) are investigated and evaluated.
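
The trade-off between code strength and scrubbing frequency can be illustrated with a minimal sketch. It is not the model used in the paper. It assumes transient symbol upsets only (no permanent faults), independent upsets at a constant rate per symbol, and an instantaneous, perfect scrub at fixed intervals. Under these assumptions the number of corrupted symbols between scrubs is a pure-birth Markov chain whose state probabilities have a binomial closed form. The function names, the RS(18,16) code, and all rates below are hypothetical values chosen for illustration.

```python
import math


def interval_failure_probability(n, k, symbol_upset_rate, scrub_period):
    """Probability that an RS(n, k) codeword becomes uncorrectable
    within one scrub interval (assumed model, transient upsets only)."""
    t = (n - k) // 2  # correctable symbol errors for a Reed-Solomon code
    # Probability that a given symbol is corrupted at least once in the interval.
    p = 1.0 - math.exp(-symbol_upset_rate * scrub_period)
    # Sum the binomial tail directly: more than t corrupted symbols.
    return sum(
        math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
        for j in range(t + 1, n + 1)
    )


def mission_reliability(n, k, symbol_upset_rate, scrub_period, mission_time):
    """Probability that the codeword survives every scrub interval
    of the mission, assuming each scrub restores an error-free word."""
    intervals = math.ceil(mission_time / scrub_period)
    p_fail = interval_failure_probability(n, k, symbol_upset_rate, scrub_period)
    return (1.0 - p_fail) ** intervals


if __name__ == "__main__":
    # Hypothetical numbers: upset rate per symbol per hour, times in hours.
    for period in (1.0, 10.0, 100.0):
        r = mission_reliability(18, 16, 1e-5, period, 10_000.0)
        print(f"scrub period {period:6.1f} h -> reliability {r:.6f}")
```

In this simplified model, shortening the scrub period or increasing the number of check symbols both raise the reliability; a fuller treatment would also account for permanent faults and the overhead of scrubbing.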
