On the power of sigmoid neural networks

We investigate the power of recurrent neural networks that apply the standard sigmoid activation function σ(z) = 2/(1 + e^{-z}) − 1. We show that, in the noiseless model, there exists a universal architecture that can be used to compute any recursive function. As a consequence, basic convergence questions concerning these architectures are shown to be undecidable even for fixed-size networks. This is the first result of its kind for the standard sigmoid activation function; previous techniques applied only to linearized and truncated versions of this function. The significance of our result, beyond the proof technique itself, lies in the popularity of the sigmoid function both in applications of artificial neural networks and in models of biological neural networks. Our techniques can be applied to a much more general class of “sigmoid-like” activation functions, suggesting that Turing universality is a relatively common property of recurrent neural network models.
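For concreteness, the following is a minimal numerical sketch of the activation function named above, together with a generic recurrent state update of the kind such networks compute. Note that σ(z) = 2/(1 + e^{-z}) − 1 is algebraically identical to tanh(z/2), so it maps the reals onto (−1, 1). The weight matrix W, bias b, and network size here are illustrative assumptions, not parameters taken from the paper's universal construction.

```python
import numpy as np

def sigma(z):
    """Bipolar sigmoid from the abstract: sigma(z) = 2/(1 + e^{-z}) - 1.
    Equivalently tanh(z/2); its range is the open interval (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-z)) - 1.0

# Illustrative recurrent update x_{t+1} = sigma(W x_t + b).
# W, b, and the network size n are arbitrary choices for demonstration.
rng = np.random.default_rng(0)
n = 4
W = rng.normal(scale=0.5, size=(n, n))
b = rng.normal(scale=0.1, size=n)

x = np.zeros(n)  # initial network state
for t in range(5):
    x = sigma(W @ x + b)
    print(t, np.round(x, 4))
```

The undecidability claim concerns exactly such dynamics: for a fixed architecture of this form, no algorithm can decide in general whether the state sequence converges.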