Towards a more analytical training of neural networks and neuro-fuzzy systems

When used for function approximation, neural networks belong to a class of models whose parameters can be separated into linear and nonlinear, according to their influence on the model output. In this work we extend this concept to the case where the training problem is formulated as the minimization of the integral of the squared error over the input domain. With this approach, gradient-based nonlinear optimization algorithms require the computation of terms that are either dependent only on the model and the input domain, or terms which are the projection of the target function onto the basis functions and onto their derivatives with respect to the nonlinear parameters. These latter terms can be computed numerically from the data provided. This functional approach brings at least two advantages over the standard training formulation: firstly, computational complexity savings, as some terms are independent of the size of the data set and matrix inverses or pseudo-inverses are avoided; secondly, as the performance surface obtained with this approach is closer to the one induced by the true (typically unknown) function, gradient-based training algorithms have a better chance of finding models that fit the underlying function more closely.
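To make the separation of terms concrete, the following is a minimal sketch of the functional criterion described above, for a model with linear output weights a and nonlinear basis parameters v; the notation (f for the target function, phi for the basis vector, D for the input domain) is introduced here for illustration and is not taken from the paper itself.

\[
\Omega(a, v) \;=\; \int_{D} \bigl( f(x) - \phi(x; v)^{\top} a \bigr)^{2} \, dx
\]

Expanding the square separates the criterion into the two kinds of terms the abstract refers to:

\[
\Omega(a, v) \;=\; \int_{D} f(x)^{2}\,dx
\;-\; 2\, a^{\top} \underbrace{\int_{D} f(x)\,\phi(x; v)\,dx}_{\text{projection of } f \text{ onto the basis}}
\;+\; a^{\top} \underbrace{\Bigl( \int_{D} \phi(x; v)\,\phi(x; v)^{\top}\,dx \Bigr)}_{\text{depends only on model and domain}} \, a
\]

Under this reading, the Gram-type integral and its derivatives with respect to v depend only on the model and the input domain, so they can be computed (often analytically) without the training data, while the projection integrals, and the analogous projections of f onto the derivatives \(\partial \phi / \partial v\) that the gradient of \(\Omega\) requires, are the only quantities that must be approximated numerically from the data. Presumably it is the pseudo-inverse of the data-dependent design matrix, used in the standard separable (variable projection) formulation, that is thereby avoided.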
