This paper introduces a variant of the ADALINE in which the input signals are normalized to have zero mean and unit variance, and in which the bias or “threshold weight” is learned slightly differently. These changes result in a linear learning element that learns much more efficiently and rapidly, and that is much less dependent on the choice of the step-size parameter. Simulation experiments show learning-time improvements ranging from 30% to several hundredfold. The memory and computational complexity of the new element remains O(N), where N is the number of input signals, and the added computations are entirely local. Theoretical analysis indicates that the new element learns optimally fast in a certain sense and to the extent that the input signals are statistically independent.
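To make the idea concrete, the sketch below shows an ADALINE-like linear element that normalizes each input online using running estimates of its mean and variance, and learns a separate bias term. The abstract does not specify the paper's exact update rules, so the class name `NormalizedAdaline`, the `norm_rate` parameter, and the particular bias update shown here are illustrative assumptions; the sketch simply applies the standard LMS (Widrow-Hoff) rule to the normalized inputs. Note that the per-input running statistics keep the added computations entirely local and the element O(N) overall.

```python
import numpy as np

class NormalizedAdaline:
    """Illustrative sketch of an ADALINE-like linear element with online
    input normalization. Not the paper's exact algorithm: the abstract
    does not give the update rules, so this uses running mean/variance
    estimates and the standard LMS rule on the normalized inputs."""

    def __init__(self, n_inputs, step_size=0.1, norm_rate=0.01):
        self.w = np.zeros(n_inputs)      # weights on normalized inputs
        self.bias = 0.0                  # "threshold weight", learned separately
        self.step_size = step_size       # LMS step-size parameter
        self.norm_rate = norm_rate       # rate for the running statistics (assumed)
        self.mean = np.zeros(n_inputs)   # running estimate of each input's mean
        self.var = np.ones(n_inputs)     # running estimate of each input's variance

    def _normalize(self, x):
        # Shift and scale each input toward zero mean, unit variance.
        return (x - self.mean) / np.sqrt(self.var + 1e-8)

    def predict(self, x):
        return self.bias + self.w @ self._normalize(x)

    def update(self, x, target):
        # Update the running input statistics; each is local to one input,
        # so time and memory both stay O(N).
        self.mean += self.norm_rate * (x - self.mean)
        self.var += self.norm_rate * ((x - self.mean) ** 2 - self.var)
        z = self._normalize(x)
        error = target - (self.bias + self.w @ z)
        # LMS updates on the normalized inputs and on the bias.
        self.w += self.step_size * error * z
        self.bias += self.step_size * error
        return error
```

Because the weights see inputs that are approximately zero-mean and unit-variance, a single global step size behaves comparably across all inputs, which is one way to understand the reduced sensitivity to the step-size parameter claimed above.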