On the power of sigmoid neural networks

We investigate the power of recurrent neural networks that apply the standard sigmoid activation function σ(z) = 2/(1 + e^{-z}) − 1. We show that, in the noiseless model, there exists a universal architecture that can be used to compute any recursive function. As a consequence, basic convergence questions concerning these architectures are shown to be undecidable even for fixed-size networks. This is the first result of its kind for the standard sigmoid activation function; previous techniques applied only to linearized and truncated versions of this function. The significance of our result, beyond the proof technique itself, lies in the popularity of the sigmoid function both in applications of artificial neural networks and in models of biological neural networks. Our techniques can be applied to a much more general class of “sigmoid-like” activation functions, suggesting that Turing universality is a relatively common property of recurrent neural network models.
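For concreteness, the following is a minimal numerical sketch of the activation function named above, together with a generic recurrent state update of the kind such networks compute. Note that σ(z) = 2/(1 + e^{-z}) − 1 is algebraically identical to tanh(z/2), so it maps the reals onto (−1, 1). The weight matrix W, bias b, and network size here are illustrative assumptions, not parameters taken from the paper's universal construction.

```python
import numpy as np

def sigma(z):
    """Bipolar sigmoid from the abstract: sigma(z) = 2/(1 + e^{-z}) - 1.
    Equivalently tanh(z/2); its range is the open interval (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-z)) - 1.0

# Illustrative recurrent update x_{t+1} = sigma(W x_t + b).
# W, b, and the network size n are arbitrary choices for demonstration.
rng = np.random.default_rng(0)
n = 4
W = rng.normal(scale=0.5, size=(n, n))
b = rng.normal(scale=0.1, size=n)

x = np.zeros(n)  # initial network state
for t in range(5):
    x = sigma(W @ x + b)
    print(t, np.round(x, 4))
```

The undecidability claim concerns exactly such dynamics: for a fixed architecture of this form, no algorithm can decide in general whether the state sequence converges.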