Statistical Behaviour of the GMDH Algorithm

The group method of data handling (GMDH) is a nonparametric regression algorithm that derives complex polynomial models of high degree. Four features critically affect its performance: the set-up of the data set on which it learns, the regularity criterion used to reject individual polynomials, the maximum number of polynomials retained per iteration, and the reference function. Trials with artificial data show that GMDH models do not perform as well as has been claimed. In particular, GMDH is inferior to regression for processes that are noisy, nonlinear, or multivariate, and for small sets of source data. A major problem is that, unless care is taken, extrapolations can diverge rapidly from the true values of a process.
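The features named above can be made concrete with a minimal sketch of a single GMDH iteration. It is an illustration under stated assumptions, not the procedure evaluated in this paper: the reference function is assumed to be the two-input quadratic polynomial, the regularity criterion is assumed to be the normalised squared error on a separate checking set, and the names `quadratic_features`, `gmdh_layer`, and `keep` are hypothetical.

```python
import itertools

import numpy as np


def quadratic_features(xi, xj):
    """Terms of a two-input quadratic reference function (assumed form)."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])


def gmdh_layer(X_train, y_train, X_check, y_check, keep=4):
    """Fit one polynomial per input pair and retain the `keep` best ones."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        # Least-squares fit of the reference function on the training set.
        A = quadratic_features(X_train[:, i], X_train[:, j])
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        # Regularity criterion (assumed): normalised squared error
        # measured on the checking set, which was not used for fitting.
        pred = quadratic_features(X_check[:, i], X_check[:, j]) @ coef
        criterion = np.sum((y_check - pred) ** 2) / np.sum(y_check**2)
        candidates.append((criterion, (i, j), coef))
    # Polynomials with the largest criterion values are rejected.
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]


# Artificial, noise-free data: the process depends only on inputs 0 and 1.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 3))
y = 1.0 + 2.0 * X[:, 0] * X[:, 1]
best = gmdh_layer(X[:40], y[:40], X[40:], y[40:], keep=2)
for criterion, pair, _ in best:
    print(pair, criterion)
```

In a full run, the outputs of the retained polynomials would become the inputs of the next iteration, so the degree of the composite model doubles with each layer. That growth in degree is one plausible reason why extrapolation beyond the range of the learning data can diverge quickly.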