Highly Fault-Tolerant Parallel Computation (Extended Abstract)

We re-introduce the coded model of fault-tolerant computation, in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by only a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations in which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time t on w processors can be performed reliably on a faulty machine in the coded model using w log^{O(1)} w processors and time t log^{O(1)} w. The failure probability of the computation will be at most t · exp(−w^{1/4}). The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in O(n log^{O(1)} n) sequential time; they are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.
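To make the last claim concrete, the sketch below shows the standard random self-correction idea for a single linear function, in the spirit of Blum, Luby, and Rubinfeld's self-testing/correcting: since f(x) = f(x + r) − f(r) for linear f, a program that is correct on most inputs can be queried only at uniformly random points and corrected by majority vote. This is an illustrative Python sketch under assumed parameters (the modulus, error rate, trial count, and names such as `faulty_prog` are not the paper's notation, and it does not reproduce the paper's parallel construction).

```python
# A minimal sketch of random self-correction for a linear function f over Z_P,
# i.e. f(x + y) = f(x) + f(y) (mod P). A program that is correct on most
# inputs is queried at random points x + r and r; the difference recovers
# f(x), and a majority vote over independent trials amplifies reliability.

import random
from collections import Counter

P = 2_147_483_647  # prime modulus used for this example

def true_f(x: int) -> int:
    """The linear function we pretend not to trust: f(x) = 7x mod P."""
    return (7 * x) % P

def faulty_prog(x: int, error_rate: float = 0.1) -> int:
    """Computes true_f correctly except on a random `error_rate`
    fraction of calls, where it returns an arbitrary value."""
    if random.random() < error_rate:
        return random.randrange(P)
    return true_f(x)

def self_correct(prog, x: int, trials: int = 15) -> int:
    """Recover f(x) from a mostly-correct program for a linear f mod P.

    Each trial uses f(x) = f(x + r) - f(r) for a uniformly random r,
    so both queries to `prog` land on uniformly distributed points.
    """
    votes = Counter()
    for _ in range(trials):
        r = random.randrange(P)
        guess = (prog((x + r) % P) - prog(r)) % P
        votes[guess] += 1
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    x = 123_456_789
    print("true value:     ", true_f(x))
    print("self-corrected: ", self_correct(faulty_prog, x))
```

Because each trial touches the program only at uniformly random inputs, a program wrong on a small constant fraction of inputs is still correct in most trials, and independent repetition drives the overall failure probability down exponentially; this is the kind of reliability amplification that coded computation aims to achieve for many linear functions at once with small overhead.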
