Markov chain algorithms: A template for building future robust low power systems

Computational systems are turning to post-CMOS devices in pursuit of lower power, but the inherent unreliability of such devices makes it difficult to design robust systems without paying additional power overheads to guarantee robustness. Algorithmic structures with an inherent ability to tolerate computational errors are therefore of significant interest. We propose casting applications as stochastic algorithms based on Markov chains, because such algorithms are both sufficiently general and tolerant of transition errors. We show how applications can be cast as Markov chain algorithms using four examples: Boolean satisfiability (SAT), sorting, LDPC decoding, and clustering. Using algorithmic fault injection techniques, we demonstrate that these implementations remain robust to transition errors even at high error rates. Based on these results, we make a case for using Markov chains as an algorithmic template for future robust low-power systems.
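As a rough illustration of the idea, the sketch below casts a small SAT instance as a random walk over truth assignments and injects faults into the transition step. It is not the implementation evaluated in this work: the function names, the fault model (a faulty step flips a uniformly random variable), the `error_rate` parameter, and the assumption that the termination check itself is reliable are all hypothetical choices made for the example.

```python
import random

def satisfied(clause, assignment):
    # A literal k > 0 means variable k is True; k < 0 means variable |k| is False.
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def noisy_random_walk_sat(clauses, num_vars, error_rate=0.2,
                          max_steps=10_000, seed=0):
    """Random-walk SAT search whose transitions are corrupted with
    probability `error_rate` (hypothetical fault model)."""
    rng = random.Random(seed)
    # Initial state of the chain: a uniformly random truth assignment.
    assignment = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    for step in range(max_steps):
        # Assumption: this satisfiability check runs reliably.
        unsat = [c for c in clauses if not satisfied(c, assignment)]
        if not unsat:
            return assignment, step
        if rng.random() < error_rate:
            # Injected transition error: flip an arbitrary variable.
            var = rng.randint(1, num_vars)
        else:
            # Intended transition: flip a variable from an unsatisfied clause.
            var = abs(rng.choice(rng.choice(unsat)))
        assignment[var] = not assignment[var]
    return None, max_steps

# Small satisfiable instance (for example x1 = x3 = x4 = True, x2 = False).
clauses = [(1, 2, -3), (-1, 3, 4), (2, -4, 3), (-2, -3, 4), (1, -2, -4)]
solution, steps = noisy_random_walk_sat(clauses, num_vars=4, error_rate=0.3)
```

Because a faulty step is simply another move in the state space, the walk can still reach a satisfying assignment. In this toy model errors affect only how many steps are needed, not the correctness of the reported answer.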
