The asymptotic optimization of the pre-edited ANN classifier

This paper studies the generalization behavior of an artificial neural network (ANN) classifier as the training-sample size grows without bound, namely its asymptotic optimization in probability. As an improved ANN model, the pre-edited ANN classifier shows better practical performance than the standard one; however, it has not been widely applied because it lacks theoretical support. To promote its practical application, this paper studies the asymptotic optimization of the pre-edited ANN classifier. As a foundation, we review previous work on the asymptotic optimization in probability of non-parametric classifiers and group the main proof methods into four classes: the two-step method, the one-step method, the generalization method, and the hypothesis method. We then adopt a mixed generalization/hypothesis method to prove that the pre-edited ANN classifier is asymptotically optimal in probability. Finally, a simulation experiment is presented to support the theoretical results.
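
In the standard non-parametric setting, "asymptotically optimal in probability" (weak Bayes-risk consistency) is usually formalized as follows: a rule g_n trained on a sample D_n of size n is asymptotically optimal in probability if, for every eps > 0,

    lim_{n -> infinity} P( L_n - L* > eps ) = 0,

where L_n = P( g_n(X) != Y | D_n ) is the conditional error probability of g_n and L* is the Bayes error.

The abstract does not reproduce the paper's exact pre-editing rule, so the following is only a minimal sketch of the general pipeline it describes: edit the training set first (here, assuming a Wilson-style k-NN edit that discards samples whose label disagrees with the majority of their k nearest neighbours), then train a standard ANN on the retained samples. The helper name wilson_edit, the choice k=3, and the MLP architecture are illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

def wilson_edit(X, y, k=3):
    """Wilson-style editing (an assumption here, not necessarily the
    paper's rule): drop each training sample whose label disagrees
    with the majority label of its k nearest neighbours."""
    # k+1 neighbours because each sample is its own nearest neighbour.
    knn = KNeighborsClassifier(n_neighbors=k + 1).fit(X, y)
    neigh = knn.kneighbors(X, return_distance=False)[:, 1:]  # drop self
    keep = np.array([np.bincount(y[idx]).argmax() == label
                     for idx, label in zip(neigh, y)])
    return X[keep], y[keep]

# Synthetic noisy data, just to make the sketch self-contained.
X, y = make_classification(n_samples=2000, n_features=10,
                           flip_y=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

X_ed, y_ed = wilson_edit(X_tr, y_tr, k=3)          # pre-editing step
ann = MLPClassifier(hidden_layer_sizes=(32,),      # illustrative architecture
                    max_iter=1000, random_state=0)
ann.fit(X_ed, y_ed)                                # train ANN on edited set
print("test accuracy:", ann.score(X_te, y_te))
```

The design intent of such a pipeline is that editing removes label noise and boundary-overlap samples before training, which is what the paper credits for the pre-edited ANN's better practical performance over the standard one.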
