Using weight decay to optimize the generalization ability of a perceptron

Weight decay was proposed to reduce overfitting, which often appears in the generalization tasks of artificial neural networks. Here weight decay is applied to a well-defined model system based on a single-layer perceptron that exhibits strong overfitting. Since the optimal non-overfitting solution is known for this system, the effect of weight decay can be compared against it directly. A strategy is proposed for finding the optimal weight-decay strength, which leads to the optimal solution for any number of examples.
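The setup above can be illustrated with a minimal numerical sketch, assuming a teacher-student scenario: labels are generated by a random "teacher" vector, and a perceptron is trained by subgradient descent on a hinge loss with an added L2 weight-decay term. The names, learning rate, and decay strength below are illustrative choices, not the paper's actual parameters or methodology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical teacher rule: labels come from a random teacher weight vector.
n_features, n_examples = 20, 100
teacher = rng.standard_normal(n_features)
X = rng.standard_normal((n_examples, n_features))
y = np.sign(X @ teacher)

def train_perceptron(X, y, weight_decay, lr=0.01, epochs=200):
    """Subgradient descent on a hinge loss with an L2 weight-decay term.

    Minimizes  sum_i max(0, 1 - y_i w.x_i) + (weight_decay / 2) * ||w||^2,
    so the decay term shrinks w toward zero at every step.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        mask = margins < 1  # examples violating the margin
        grad = -(y[mask, None] * X[mask]).sum(axis=0) + weight_decay * w
        w -= lr * grad
    return w

w_decay = train_perceptron(X, y, weight_decay=1.0)
w_plain = train_perceptron(X, y, weight_decay=0.0)

# The decay term keeps the weight norm smaller than unregularized training,
# which is the mechanism by which it can suppress overfitting.
print(np.linalg.norm(w_decay), np.linalg.norm(w_plain))
```

Tuning `weight_decay` against a validation set is the practical analogue of the strategy the paper proposes for choosing the optimal decay strength.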
