This article extends neural networks to the case of an uncountable number of hidden units, in several ways. In the first approach proposed, a finite parametrization is still possible, allowing gradient-based learning. While it has the same number of parameters as an ordinary neural network, its internal structure suggests that it can represent some smooth functions much more compactly. Under mild assumptions, we also obtain error bounds that are better than those for ordinary neural networks. Furthermore, this parametrization may help to reduce the problem of saturation of the neurons. In a second approach, the input-to-hidden weights are fully nonparametric, yielding a kernel machine for which we demonstrate a simple kernel formula. Interestingly, the resulting kernel machine can be made hyperparameter-free, and it still generalizes despite the absence of explicit regularization.
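As a rough illustration of the second approach, the sketch below approximates the kernel induced by an uncountable (infinite) hidden layer, k(x, x') = E_w[h(w·x) h(w·x')], by Monte Carlo averaging over sampled hidden weights, and then fits a kernel predictor with no explicit regularization term. The tanh hidden units, the Gaussian prior over input-to-hidden weights, and the pseudoinverse (minimum-norm least-squares) solver are assumptions made for illustration only; the abstract does not state the paper's exact kernel formula or training procedure.

```python
# Minimal sketch, NOT the paper's construction: Monte Carlo estimate of an
# infinite-hidden-layer kernel, followed by a hyperparameter-free kernel predictor.
import numpy as np

rng = np.random.default_rng(0)

def mc_kernel(X1, X2, n_hidden=10_000, scale=1.0):
    """Approximate k(x, x') = E_w[h(w.x) h(w.x')] by averaging over sampled weights."""
    d = X1.shape[1]
    W = rng.normal(0.0, scale, size=(d, n_hidden))   # assumed Gaussian prior on hidden weights
    H1, H2 = np.tanh(X1 @ W), np.tanh(X2 @ W)        # assumed tanh hidden units
    return H1 @ H2.T / n_hidden

# Toy usage: fit targets with the minimum-norm solution (no ridge term, no tuned hyperparameters).
X_train = rng.normal(size=(50, 3))
y_train = np.sin(X_train.sum(axis=1))
X_test = rng.normal(size=(5, 3))

K = mc_kernel(X_train, X_train)
alpha = np.linalg.pinv(K) @ y_train                  # pseudoinverse: no explicit regularization
y_pred = mc_kernel(X_test, X_train) @ alpha
print(y_pred)
```

In the limit of infinitely many sampled hidden units, the Monte Carlo estimate converges to the underlying kernel, which is the sense in which the hidden layer becomes nonparametric.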