A fast linear-in-the-parameters classifier construction algorithm using orthogonal forward selection to minimize leave-one-out misclassification rate

We propose a simple and computationally efficient construction algorithm for two-class linear-in-the-parameters classifiers. To optimize model generalization, an orthogonal forward selection (OFS) procedure is used to minimize the leave-one-out (LOO) misclassification rate directly. An analytic formula for the LOO misclassification rate, together with a set of forward recursive updating formulae, is developed and applied in the proposed algorithm. Numerical examples demonstrate that the proposed algorithm is an excellent alternative for constructing sparse two-class classifiers in terms of both performance and computational efficiency.
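
The idea summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm or its exact formulae: it assumes an unregularized least-squares fit, labels in {-1, +1}, and the standard identity that the LOO residual of a linear least-squares model equals e_i / (1 - h_ii), where h_ii is the i-th diagonal entry of the hat matrix. The function names `dot` and `loo_forward_selection`, the stopping rule, and the choice to count a zero output as an error are all hypothetical choices made for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))


def loo_forward_selection(candidates, y, max_terms, eps=1e-12):
    """Greedy orthogonal forward selection driven by the LOO error count.

    candidates: list of candidate regressor columns (lists of floats)
    y:          list of class labels, each -1 or +1
    Returns (indices of selected columns, LOO misclassification rate).
    """
    n = len(y)
    residual = [float(v) for v in y]  # e_i = y_i - f(x_i); empty model has f = 0
    leverage = [0.0] * n              # h_ii, updated recursively per added term
    basis, selected = [], []
    best_errors = n                   # assumption: zero output counts as an error

    while len(selected) < max_terms:
        best = None
        for j, col in enumerate(candidates):
            if j in selected:
                continue
            # Gram-Schmidt: orthogonalize the candidate against chosen terms.
            w = [float(v) for v in col]
            for b in basis:
                c = dot(b, w) / dot(b, b)
                w = [wi - c * bi for wi, bi in zip(w, b)]
            ww = dot(w, w)
            if ww < eps:              # numerically dependent column, skip it
                continue
            # Orthogonality lets the weight of the new term be found alone.
            g = dot(w, residual) / ww
            res = [e - g * wi for e, wi in zip(residual, w)]
            lev = [h + wi * wi / ww for h, wi in zip(leverage, w)]
            # LOO output for sample i is y_i - e_i / (1 - h_ii); it is an
            # error when its sign disagrees with the label y_i.
            errors = 0
            for yi, e, h in zip(y, res, lev):
                if 1.0 - h < eps or yi * (yi - e / (1.0 - h)) <= 0.0:
                    errors += 1
            if best is None or errors < best[0]:
                best = (errors, j, w, res, lev)
        # Stop when no remaining candidate lowers the LOO error count.
        if best is None or best[0] >= best_errors:
            break
        best_errors, j, w, residual, leverage = best
        basis.append(w)
        selected.append(j)

    return selected, best_errors / n
```

Because each new term is orthogonal to those already chosen, the residuals and leverages are updated in O(n) per candidate without refitting the model, which is the kind of saving that recursive updating provides. The selection terminates on its own once an extra term no longer reduces the LOO error count, so `max_terms` acts only as an upper bound.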
