Double backpropagation increasing generalization performance
One test of a training algorithm is how well the algorithm generalizes from the training data to the test data. It is shown that a training algorithm termed double back-propagation improves generalization by simultaneously minimizing the normal energy term found in back-propagation and an additional energy term that is related to the sum of the squares of the input derivatives (gradients). In normal back-propagation training, minimizing the energy function tends to push the input gradient toward zero; however, this is not always possible. Double back-propagation explicitly pushes the input gradients toward zero, making the minimum broader and increasing generalization on the test data. The authors show the improvement over normal back-propagation on four candidate architectures, using a training set of 320 handwritten numbers and a test set of 180.
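The following is a minimal sketch (not the authors' original implementation) of the double back-propagation objective described above: the standard energy term plus a penalty on the squared input derivatives. It assumes a PyTorch-style model `net`, a batch of inputs `x` with targets `y`, and a hypothetical weighting factor `lam` for the gradient-penalty term.

```python
import torch
import torch.nn.functional as F

def double_backprop_loss(net, x, y, lam=0.01):
    """Normal energy term plus a penalty on the squared input gradients (a sketch)."""
    x = x.clone().requires_grad_(True)       # track gradients with respect to the input
    output = net(x)
    energy = F.mse_loss(output, y)           # the "normal" back-propagation energy term
    # Gradient of the energy with respect to the input; create_graph=True
    # lets us back-propagate through this gradient (the "double" pass).
    input_grad, = torch.autograd.grad(energy, x, create_graph=True)
    grad_penalty = input_grad.pow(2).sum()   # sum of the squares of the input derivatives
    return energy + lam * grad_penalty
```

Minimizing this combined loss with an ordinary optimizer drives the input gradients toward zero explicitly, which is the mechanism the abstract credits with broadening the minimum and improving generalization.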