On the Hinge-Finding Algorithm for Hinging Hyperplanes

This correspondence concerns the estimation algorithm for hinging hyperplane (HH) models, a class of piecewise-linear models for approximating functions of several variables, suggested in Breiman (1993). The estimation algorithm is analyzed and shown to be a special case of a Newton algorithm applied to a sum-of-squared-errors criterion. This insight is then used to suggest possible improvements of the algorithm so that convergence to a local minimum can be guaranteed. In addition, the way of updating the parameters in the HH model is discussed. Breiman proposes a stepwise updating procedure in which only a subset of the parameters is changed in each step. This procedure is closely connected to some previously suggested greedy algorithms, which are discussed and compared to a simultaneous updating of all parameters.
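
To make the algorithm under discussion concrete, the following is a minimal sketch for a single hinge in one input variable, h(x) = max(a+ + b+ x, a- + b- x). It assumes the usual description of hinge finding: split the data by which line is currently larger, refit each line by least squares on its own side, and repeat. The step-length halving is one common way to damp a Newton-type step so that the squared-error criterion never increases. Whether it matches the modification proposed in the paper is an assumption, and all function and parameter names are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return (mean_y - slope * mean_x, slope)


def line(params, x):
    """Evaluate a + b*x for params = (a, b)."""
    return params[0] + params[1] * x


def sse(points, plus, minus):
    """Sum of squared errors of the hinge max(plus line, minus line)."""
    return sum((y - max(line(plus, x), line(minus, x))) ** 2
               for x, y in points)


def blend(old, new, mu):
    """Move a fraction mu of the way from old to new parameters."""
    return tuple(o + mu * (n - o) for o, n in zip(old, new))


def find_hinge(points, plus, minus, max_iter=50, max_halvings=20):
    """Damped hinge finding (illustrative, not the paper's exact method)."""
    for _ in range(max_iter):
        # Partition the data by which line is currently the larger one.
        s_plus = [(x, y) for x, y in points
                  if line(plus, x) >= line(minus, x)]
        s_minus = [(x, y) for x, y in points
                   if line(plus, x) < line(minus, x)]
        # Each side needs at least two distinct x values for a line fit.
        if min(len({x for x, _ in s}) for s in (s_plus, s_minus)) < 2:
            break
        # mu = 1 takes the full least-squares refit on each side.
        target_plus, target_minus = fit_line(s_plus), fit_line(s_minus)
        current = sse(points, plus, minus)
        mu = 1.0
        for _halving in range(max_halvings):
            cand_plus = blend(plus, target_plus, mu)
            cand_minus = blend(minus, target_minus, mu)
            if sse(points, cand_plus, cand_minus) < current:
                break  # accept the first step that lowers the criterion
            mu /= 2  # otherwise shorten the step and try again
        else:
            return plus, minus  # no improving step: stop at current point
        plus, minus = cand_plus, cand_minus
    return plus, minus
```

Calling `find_hinge(points, plus, minus)` with a list of `(x, y)` pairs and two initial `(intercept, slope)` tuples returns the updated pair of lines. The multivariate case would replace `fit_line` with a general least-squares solve on each side.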

[1] L. Jones. A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network Training, 1992.

[2] Bernard Delyon, et al. Wavelets in identification, 1994, Fuzzy logic and expert systems applications.

[3] Lennart Ljung, et al. Neural Networks in System Identification, 1994.

[4] John E. Dennis, et al. Numerical methods for unconstrained optimization and nonlinear equations, 1983, Prentice Hall series in computational mathematics.

[5] A. Juditsky, et al. Wavelets in identification: wavelets, splines, neurons, fuzzies: how good for identification, 1994.

[6] N. Draper, et al. Applied Regression Analysis, 1966.

[7] Patrick van der Smagt. Minimisation methods for training feedforward neural networks, 1994, Neural Networks.

[8] Predrag Pucar, et al. Parametrization and Conditioning of Hinging Hyperplane Models, 1996.

[9] Leo Breiman, et al. Hinging hyperplanes for regression, classification, and function approximation, 1993, IEEE Trans. Inf. Theory.

[10] J. Friedman, et al. Projection Pursuit Regression, 1981.

[11] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.

[12] Ronald A. DeVore, et al. Some remarks on greedy algorithms, 1996, Adv. Comput. Math.

[13] Peter L. Bartlett, et al. Efficient agnostic learning of neural networks with bounded fan-in, 1996, IEEE Trans. Inf. Theory.

[14] L. Ljung, et al. Overtraining, regularization and searching for a minimum, with application to neural networks, 1995.

[15] John E. Moody, et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems, 1991, NIPS.