Rescaling of variables in back propagation learning

Abstract

Use of the logistic derivative in backward error propagation suggests one source of ill-conditioning to be the decreasing multiplier in the computation of the elements of the gradient at each layer. A compensatory rescaling is suggested, based heuristically upon the expected value of the multiplier. Experimental results demonstrate an order of magnitude improvement in convergence.
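
To make the abstract's idea concrete, the sketch below is an illustration, not the paper's exact procedure. The logistic derivative y(1 - y) never exceeds 1/4, so each backward step shrinks the error signal; if activations are treated as uniform on (0, 1), the expected value of y(1 - y) is 1/6, which suggests multiplying each hidden-layer error signal by 6 as a compensatory rescaling in the spirit of the expected-value heuristic. The network sizes, the squared-error loss, the uniform-activation assumption, and the factor of 6 are assumptions made for illustration.

# A minimal sketch, assuming logistic layers and squared-error loss: backprop
# with an optional per-layer rescaling of the error signal by 6, i.e. by
# 1 / E[y(1 - y)] when activations are treated as uniform on (0, 1).
import numpy as np

rng = np.random.default_rng(0)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_deltas(weights, x, target, rescale=False):
    """Return the per-layer error signals (deltas) for a squared-error loss."""
    # Forward pass: store each layer's activation.
    activations = [x]
    for W in weights:
        activations.append(logistic(W @ activations[-1]))

    # Output-layer error times the logistic derivative y * (1 - y).
    y = activations[-1]
    delta = (y - target) * y * (1.0 - y)
    deltas = [delta]

    # Backward pass: each step multiplies by another y * (1 - y) factor (<= 1/4),
    # the decreasing multiplier discussed in the abstract.
    for layer in range(len(weights) - 1, 0, -1):
        y = activations[layer]
        delta = (weights[layer].T @ delta) * y * (1.0 - y)
        if rescale:
            delta = 6.0 * delta   # assumed compensation factor: 1 / E[y(1 - y)] = 6
        deltas.append(delta)

    return list(reversed(deltas))

# Example: with four weight layers, the plain deltas shrink layer by layer,
# while the rescaled deltas remain on a comparable scale.
sizes = [8, 8, 8, 8, 1]
weights = [rng.normal(scale=0.5, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
x, target = rng.random(8), np.array([1.0])

for name, flag in [("plain", False), ("rescaled", True)]:
    ds = backprop_deltas(weights, x, target, rescale=flag)
    print(name, [float(np.abs(d).mean()) for d in ds])

Printing the mean absolute delta per layer makes the effect visible: without rescaling the earliest layers receive gradient elements orders of magnitude smaller than the output layer's, whereas the rescaled signals stay of similar size across layers.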