Natural Gradient Descent for Training Multi-Layer Perceptrons

The main difficulty in implementing the natural gradient learning rule is computing the inverse of the Fisher information matrix when the input dimension is large. We have found a new scheme to represent the Fisher information matrix. Based on this scheme, we have designed an algorithm to compute the inverse of the Fisher information matrix. When the input dimension n is much larger than the number of hidden neurons, the complexity of this algorithm is of order O(n²), while the complexity of conventional algorithms for the same purpose is of order O(n³). Simulations have confirmed the efficiency and robustness of the natural gradient learning rule.
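To make the setting concrete, the following is a minimal sketch of the natural gradient update θ ← θ − η F⁻¹g for a small single-hidden-layer perceptron, using the damped empirical Fisher matrix and a direct linear solve. This is the conventional O(n³)-per-step approach that the abstract contrasts with; the paper's O(n²) scheme exploits the structure of the Fisher matrix for MLPs and is not reproduced here. All names, sizes, and hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, N = 4, 3, 64            # toy sizes; the paper targets large n_in
dim = n_in * n_hidden + n_hidden        # hidden weights W plus output weights v

def unpack(theta):
    W = theta[:n_in * n_hidden].reshape(n_hidden, n_in)
    v = theta[n_in * n_hidden:]
    return W, v

def grad_sample(theta, x, t):
    """Per-sample gradient of the squared error for a tanh MLP with one output."""
    W, v = unpack(theta)
    h = np.tanh(W @ x)
    err = v @ h - t
    gW = err * np.outer(v * (1.0 - h**2), x)   # backprop through tanh hidden layer
    gv = err * h
    return np.concatenate([gW.ravel(), gv])

def mse(theta, X, y):
    W, v = unpack(theta)
    return np.mean((np.tanh(X @ W.T) @ v - y) ** 2)

X = rng.normal(size=(N, n_in))
y = np.tanh(X @ rng.normal(size=n_in))          # synthetic teacher targets
theta = 0.1 * rng.normal(size=dim)

eta, damping = 0.02, 1e-2                       # illustrative hyperparameters
loss_before = mse(theta, X, y)
for _ in range(200):
    G = np.stack([grad_sample(theta, x, t) for x, t in zip(X, y)])
    g = G.mean(axis=0)                          # averaged gradient
    F = G.T @ G / N + damping * np.eye(dim)     # damped empirical Fisher matrix
    theta -= eta * np.linalg.solve(F, g)        # natural-gradient step, O(dim^3) solve
loss_after = mse(theta, X, y)
```

The `np.linalg.solve` call is the bottleneck the paper addresses: its cost grows cubically with the parameter dimension, which for a wide input layer is dominated by n, hence the value of an O(n²) inversion scheme.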