Paper information - CombNET-III: A Support Vector Machine Based Large Scale Classifier with Probabilistic Framework

Several research fields have to deal with very large classification problems, e.g. handwritten character recognition and speech recognition. Many methods have been proposed for problems with large numbers of samples, but few address problems with large numbers of classes. CombNET-II was one of the first methods proposed for this kind of task. It consists of a gating network based on sequential-clustering vector quantization (VQ), called the stem network, and several expert classifiers based on Multilayer Perceptrons (MLP), called branch networks. With the objectives of increasing classification accuracy and providing a more flexible model, this paper proposes CombNET-III, a new model based on the CombNET-II structure. The new model, intended for, but not limited to, problems with large numbers of classes, replaces the MLP branch networks with multiclass Support Vector Machines (SVM). It also introduces a new probabilistic framework that outputs posterior class probabilities, enabling the model to be applied in different scenarios (e.g. together with Hidden Markov Models). These changes permit the use of a larger number of smaller clusters, which reduces the complexity of the final classifiers. Moreover, the use of binary SVMs with probabilistic outputs and a probabilistic decoding scheme permits a pairwise output encoding on the branch networks, which reduces the computational complexity of the training stage. The experimental results show that the proposed model outperforms both the previous model, CombNET-II, and a single multiclass SVM, while having considerably lower complexity than the latter. They also confirm that the classification accuracy of CombNET-III scales better than that of CombNET-II as the number of clusters increases.
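The abstract does not give the formulas of the probabilistic framework, so the following is only a minimal sketch of one common way to combine a gating stage with expert classifiers, assuming a mixture-of-experts style sum P(class | x) = sum_j P(cluster j | x) * P(class | x, cluster j). The softmax-over-distances gate, the function names `stem_posteriors` and `combine_posteriors`, and the toy numbers are all illustrative assumptions, not the paper's method. The per-cluster dictionaries stand in for the posterior outputs that the branch SVMs would produce.

```python
import math

def stem_posteriors(x, centroids, temperature=1.0):
    """Assumed soft gating: P(cluster j | x) as a softmax over negative
    squared distances from x to each cluster centroid."""
    scores = [
        -sum((a - b) ** 2 for a, b in zip(x, c)) / temperature
        for c in centroids
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_posteriors(gate, branch_posteriors):
    """Assumed combination rule:
    P(class | x) = sum_j P(cluster j | x) * P(class | x, cluster j)."""
    combined = {}
    for g, branch in zip(gate, branch_posteriors):
        for label, p in branch.items():
            combined[label] = combined.get(label, 0.0) + g * p
    return combined

# Toy data (hypothetical): two clusters, each with its own set of classes.
centroids = [(0.0, 0.0), (4.0, 4.0)]
# Stand-ins for the per-branch class posteriors of one input sample.
branch_outputs = [{"a": 0.7, "b": 0.3}, {"c": 0.6, "d": 0.4}]

x = (0.5, 0.2)
gate = stem_posteriors(x, centroids)
posterior = combine_posteriors(gate, branch_outputs)
prediction = max(posterior, key=posterior.get)
```

Because both the gate and each branch output sum to one, the combined values also sum to one, so they can be passed on as class posteriors to a later stage such as the Hidden Markov Model scenario mentioned above.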
