Generalized guard-zone algorithm (GGA) for learning: automatic selection of threshold

Abstract: This work continues our earlier work on the Generalized Guard-zone Algorithm (GGA) for self-supervised parameter learning. We attempt to determine the guard-zone parameter λn (i.e., the threshold used for discarding doubtful or mislabeled samples) automatically at every instant of learning, for the general m-class, N-feature pattern recognition problem. This is done by minimizing the mean squared error (MSE) of the estimate under a simple probabilistic model that accounts for the presence of mislabeled training samples. Under the assumption of normality, the estimates of λn so obtained are found to be distribution-free; that is, they do not depend on the parameters of the distribution. They are functions of N, the iteration number n, and certain percentage points of the beta distribution with parameters N and n - N. The effectiveness of the automatic selection of the guard-zone dimension is demonstrated on a bivariate three-class data set, showing the improvement in the performance of the GGA.
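To make the role of the guard-zone threshold concrete, the following is a minimal sketch of a single guarded update of a class mean, written under several assumptions that are not taken from the abstract: the function names `mahalanobis_sq` and `guarded_mean_update` are hypothetical, the distance is a squared Mahalanobis distance, the inverse covariance matrix is treated as known and fixed, and the threshold `lam` is supplied by the caller rather than derived from the beta percentage points described above. It illustrates only the discard-or-accept step, not the automatic selection rule itself.

```python
def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance of x from mean, given an inverse covariance."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    k = len(d)
    return sum(d[i] * cov_inv[i][j] * d[j] for i in range(k) for j in range(k))


def guarded_mean_update(mean, n, x, cov_inv, lam):
    """One guarded update of a running class mean.

    mean    : current mean estimate, based on n accepted samples
    x       : new sample, self-labeled as belonging to this class
    cov_inv : inverse covariance matrix (assumed known here, an illustration choice)
    lam     : guard-zone threshold (assumed given; not computed from beta points)

    Returns (new_mean, new_n, accepted).
    """
    # Samples falling outside the guard zone are treated as doubtful and discarded.
    if mahalanobis_sq(x, mean, cov_inv) > lam:
        return mean, n, False
    # Accepted samples enter the usual recursive mean estimate.
    n_new = n + 1
    new_mean = [m + (xi - m) / n_new for m, xi in zip(mean, x)]
    return new_mean, n_new, True


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    mean, n = [0.0, 0.0], 4
    for sample in ([1.0, 1.0], [5.0, 5.0]):
        mean, n, accepted = guarded_mean_update(mean, n, sample, identity, 4.0)
        print(sample, "accepted" if accepted else "discarded", mean, n)
```

In this sketch a larger `lam` admits more samples (including possibly mislabeled ones), while a smaller `lam` discards more of them, which is the trade-off the automatic selection of λn is meant to resolve.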