Asymptotic Convergence of Backpropagation: Numerical Experiments

We have calculated, both analytically and in simulations, the long-time rate of convergence of the backpropagation learning algorithm for networks with and without hidden units. Our basic finding for units using the standard sigmoid transfer function is that the error converges as 1/t for large t, with at most logarithmic corrections for networks with hidden units. Other transfer functions may lead to a slower polynomial rate of convergence. Our analytic calculations were presented in Tesauro, He & Ahmad (1989). Here we focus in more detail on our empirical measurements of the convergence rate in numerical simulations, which confirm our analytic results.
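As a rough illustration of how such a convergence rate can be measured, the sketch below trains a single sigmoid unit on one pattern by gradient descent and estimates the exponent of the error decay from a log-log slope. It is not the simulation setup used in this work: the one-weight network, the squared-error measure E = (target - y)^2, the learning rate, the number of steps, and the function names `train_single_unit` and `loglog_slope` are all assumptions chosen for brevity.

```python
import math


def sigmoid(x):
    """Standard logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def train_single_unit(steps, lr=0.5, w0=0.0):
    """Gradient descent on E = (target - y)^2 for one sigmoid unit.

    Assumed toy setup: a single weight, a single input pattern x = 1,
    and target 1, so the weight must grow without bound to fit exactly.
    Returns the error recorded before each weight update.
    """
    w = w0
    x, target = 1.0, 1.0
    errors = []
    for _ in range(steps):
        y = sigmoid(w * x)
        delta = target - y
        errors.append(delta * delta)
        # dE/dw = -2 * delta * y * (1 - y) * x, so step against the gradient.
        w += lr * 2.0 * delta * y * (1.0 - y) * x
    return errors


def loglog_slope(errors, t_start, t_end):
    """Two-point estimate of d(log E)/d(log t); time steps are 1-indexed."""
    e1, e2 = errors[t_start - 1], errors[t_end - 1]
    return (math.log(e2) - math.log(e1)) / (math.log(t_end) - math.log(t_start))


errors = train_single_unit(steps=100_000)
slope = loglog_slope(errors, 10_000, 100_000)
print(f"estimated exponent: {slope:.3f}")
```

For this toy problem the estimated exponent should come out close to -1, consistent with 1/t decay of the error. A two-point slope is the crudest possible estimator; a least-squares fit of log E against log t over the late-time window would be a more robust choice.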