Error tolerance: Why and how to use slightly defective digital systems

We provide an overview of the notion of error tolerance and describe the context that motivated its development. We then summarize some of our case studies, which demonstrate the significant potential benefits of error tolerance, and the testing and design techniques that we have developed for error-tolerant systems. Finally, we conclude by identifying the paradigm shifts required for wide exploitation of error tolerance.

The notion of error tolerance is motivated by three important trends in information processing: changes in fabrication technology, changes in the mix of applications, and the emergence of new paradigms of computation.

Fabrication technology: As we get closer to what some call the "end of CMOS", we see the emergence of highly unreliable and defect-prone technologies. This is accompanied by rapid development of new computing technologies such as bio, molecular, and quantum devices. Most of these new technologies are also extremely unreliable and defect-prone (e.g., see [12]). However, they also make it possible to carry out massive numbers of computations in parallel, at speeds that far exceed those currently achieved by CMOS devices.

Applications: An increasingly large fraction of the chips fabricated each year implement multi-media applications and process signals representing audio, speech, images, video and graphics. The outputs of such systems eventually become input signals to human users. The computational requirements of such systems have several interesting aspects.
1) The result of computation, i.e., the output data, is judged not as right or wrong but by its perceptual quality to human users. For example, the perceptual quality of an image may be defined in terms of absence of visible artifacts, clarity, color and intensity. In other words, the criterion is not correctness but whether the end product is acceptable to the human user.
2) Most such systems are lossy by design, in the sense that their outputs deviate from perfection due to sampling of input signals, conversion to digital form, quantization, lossy encoding, decoding and conversion back to analog signals.
3) Many such applications require parallel architectures, as they are computationally intensive and have real-time performance constraints.

Emerging paradigms of computation: Several new paradigms are emerging for how functions are computed and what requirements are placed on the "correctness" and "accuracy" of the results. With tongue in cheek, …

[1] Sandeep Gupta, et al. Multi-Vector Tests: A Path to Perfect Error-Rate Testing, 2008, 2008 Design, Automation and Test in Europe.

[2] Thierry Paul, et al. Quantum computation and quantum information, 2007, Mathematical Structures in Computer Science.

[3] Hye-Yeon Cheong, et al. Distance Quantization Method for Fast Nearest Neighbor Search Computations with applications to Motion Estimation, 2007, 2007 Conference Record of the Forty-First Asilomar Conference on Signals, Systems and Computers.

[4] Sandeep K. Gupta, et al. ERTG: A test generator for error-rate testing, 2007, 2007 IEEE International Test Conference.

[5] Antonio Ortega, et al. Power Efficient Motion Estimation using Multiple Imprecise Metric Computations, 2007, 2007 IEEE International Conference on Multimedia and Expo.

[6] Melvin A. Breuer, et al. Estimating Error Rate in Defective Logic Using Signature Analysis, 2007, IEEE Transactions on Computers.

[7] M. Breuer, et al. Reduction of Detected Acceptable Faults for Yield Improvement via Error-Tolerance, 2007, 2007 Design, Automation & Test in Europe Conference & Exhibition.

[8] Antonio Ortega, et al. Dynamic Voltage Scaling Algorithms for Power Constrained Motion Estimation, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[9] Antonio Ortega, et al. Motion estimation performance models with application to hardware error tolerance, 2007, Electronic Imaging.

[10] Melvin A. Breuer, et al. Error-tolerance and multi-media, 2006, 2006 International Conference on Intelligent Information Hiding and Multimedia.

[11] K. Chugg, et al. Irregular Designs for Two-State Systematic with Serial Concatenated Parity Codes, 2006, MILCOM 2006 - 2006 IEEE Military Communications conference.

[12] Naresh R. Shanbhag, et al. Energy-efficient Motion Estimation using Error-Tolerance, 2006, ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design.

[13] Antonio Ortega, et al. Computation Error Tolerance in Motion Estimation Algorithms, 2006, 2006 International Conference on Image Processing.

[14] Sandeep K. Gupta, et al. Estimating Error Rate during Self-Test via One's Counting, 2006, 2006 IEEE International Test Conference.

[15] Sandeep Gupta, et al. A Theory of Error-Rate Testing, 2006, 2006 International Conference on Computer Design.

[16] Melvin A. Breuer, et al. An error-oriented test methodology to improve yield with error-tolerance, 2006, 24th IEEE VLSI Test Symposium.

[17] Keith M. Chugg, et al. An Iterative Algorithm and Low Complexity Hardware Architecture for Fast Acquisition of Long PN Codes in UWB Systems, 2006, J. VLSI Signal Process.

[18] Krishna V. Palem, et al. Ultra-Efficient (Embedded) SOC Architectures based on Probabilistic CMOS (PCMOS) Technology, 2006, Proceedings of the Design Automation & Test in Europe Conference.

[19] Sandeep K. Gupta, et al. Threshold testing: Covering bridging and other realistic faults, 2005, 14th Asian Test Symposium (ATS'05).

[20] Melvin A. Breuer, et al. A novel test methodology based on error-rate to support error-tolerance, 2005, IEEE International Conference on Test, 2005.

[21] Antonio Ortega, et al. Hardware testing for error tolerant multimedia compression based on linear transforms, 2005, 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05).

[22] Antonio Ortega, et al. Analysis and testing for error tolerant motion estimation, 2005, 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05).

[23] Melvin A. Breuer, et al. Multi-media applications and imprecise computation, 2005, 8th Euromicro Conference on Digital System Design (DSD'05).

[24] Melvin A. Breuer, et al. Let's think analog, 2005, IEEE Computer Society Annual Symposium on VLSI: New Frontiers in VLSI Design (ISVLSI'05).

[25] Krishna V. Palem, et al. Ultra Low-energy Computing via Probabilistic Algorithms and Devices: CMOS Device Primitives and the Energy-Probability Relationship, 2004.

[26] Radu Marculescu, et al. Modeling, Analysis, and Self-Management of Electronic Textiles, 2003, IEEE Trans. Computers.

[27] Sandeep K. Gupta, et al. An ATPG for threshold testing: obtaining acceptable yield in future processes, 2002, Proceedings. International Test Conference.

[28] G. Palm, et al. Computing with neural networks, 1987, Science.

[29] Alex A. Freitas, et al. Evolutionary Computation, 2002.

[30] Antonio Ortega, et al. New Quality Metrics for Multimedia Compression Using Faulty Hardware, 2006.