A Learning Algorithm of Boosting Kernel Discriminant Analysis for Pattern Recognition

In this paper, we present a new method for enhancing the classification performance of a multiple-classifier system by combining a boosting technique called AdaBoost.M2 with Kernel Discriminant Analysis (KDA). To reduce the dependency between classifier outputs and to speed up learning, each classifier is trained in a different feature space, obtained by applying KDA to a small set of hard-to-classify training samples. The training of the system is conducted based on AdaBoost.M2, and the classifiers are implemented as Radial Basis Function (RBF) networks. To perform KDA at every boosting round within a realistic time, a new kernel selection method based on a class separability measure is proposed. Furthermore, a new criterion for training convergence is proposed so that good classification performance is attained with fewer boosting rounds. To evaluate the proposed method, several experiments are carried out on standard benchmark datasets. The experimental results demonstrate that the proposed method selects an optimal kernel parameter more efficiently than conventional cross-validation, and that the training of the boosting classifiers terminates after a fairly small number of rounds while attaining good classification accuracy. For multi-class classification problems, the proposed method outperforms both Boosting Linear Discriminant Analysis (BLDA) and a Radial Basis Function Network (RBFN) in terms of classification accuracy. For 2-class problems, on the other hand, the advantage of the proposed Boosting KDA (BKDA) over BLDA and RBFN depends on the dataset.
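To make the kernel selection idea concrete, the following is a minimal sketch, not the paper's exact procedure: it assumes a Gaussian (RBF) kernel and scores each candidate kernel width by the trace ratio tr(S_B)/tr(S_W) of between-class to within-class scatter in the kernel-induced feature space, computed directly from the kernel matrix. In the full method, such a score would be evaluated on the weighted, hard-to-classify subset drawn at each boosting round rather than on the whole training set.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def separability(K, y):
    """Class separability tr(S_B)/tr(S_W) in the kernel-induced feature space,
    computed from the kernel matrix alone (no explicit feature mapping needed)."""
    n = len(y)
    tr_sb = -K.sum() / n          # -(1/n) * sum of all kernel values
    tr_sw = 0.0
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        Kcc = K[np.ix_(idx, idx)]
        tr_sb += Kcc.sum() / len(idx)                 # between-class scatter term
        tr_sw += np.trace(Kcc) - Kcc.sum() / len(idx)  # within-class scatter term
    return tr_sb / max(tr_sw, 1e-12)

def select_kernel_width(X, y, candidates):
    """Pick the Gaussian width that maximizes the separability score on (X, y)."""
    scores = [separability(gaussian_kernel(X, s), y) for s in candidates]
    return candidates[int(np.argmax(scores))]

# Toy usage: two Gaussian blobs and a coarse grid of candidate widths.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(3.0, 1.0, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
print(select_kernel_width(X, y, [0.1, 0.5, 1.0, 2.0, 5.0]))
```

Because the score is computed from the kernel matrix, a candidate width can be evaluated without running a full KDA, which is what makes per-round kernel selection cheaper than cross-validation in this sketch.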
