Discriminative Learning for Minimum Error Classification

The advent of artificial neural networks and learning vector quantizers has renewed interest in reexamining the classical techniques of discriminant analysis so that they suit these new classifier structures. One problem of particular interest is minimum error classification, in which the misclassification probability is to be minimized based on a given set of training samples. In this paper, we propose a new formulation of the minimum error classification problem, together with a fundamental technique for designing a classifier that approaches the objective of minimum classification error more directly than traditional methods. We compare the new method with several traditional classifier designs in typical experiments to demonstrate the superiority of the new learning formulation. The method can be applied to other classifier structures as well. Experimental results on a speech recognition task are also provided to show the effectiveness of the new technique.
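
The abstract does not spell out the formulation, so the following is only an illustrative sketch of what approaching the error count "directly" can look like. It assumes linear discriminant functions, a misclassification measure defined as the best rival score minus the labeled-class score, and a sigmoid that smooths the 0/1 error into a differentiable quantity. All function names, the `slope` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import math

def discriminants(weights, x):
    """Linear discriminant g_k(x) = w_k . x + b_k for each class k (assumed form)."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in weights]

def misclassification_measure(g, label):
    """Positive when the best competing class outscores the labeled class."""
    best_rival = max(v for k, v in enumerate(g) if k != label)
    return best_rival - g[label]

def smoothed_loss(d, slope=1.0):
    """Sigmoid of the measure: a differentiable stand-in for the 0/1 error."""
    return 1.0 / (1.0 + math.exp(-slope * d))

def empirical_losses(weights, samples, slope=1.0):
    """Return (hard error rate, smoothed error rate) over labeled samples."""
    hard = 0.0
    soft = 0.0
    for x, label in samples:
        d = misclassification_measure(discriminants(weights, x), label)
        hard += 1.0 if d > 0 else 0.0
        soft += smoothed_loss(d, slope)
    n = len(samples)
    return hard / n, soft / n

# Toy two-class data (hypothetical): the third sample is misclassified.
samples = [([0.0, 1.0], 0), ([1.0, 0.0], 1), ([0.9, 0.8], 0)]
weights = [([-1.0, 1.0], 0.0), ([1.0, -1.0], 0.0)]

hard, soft_gentle = empirical_losses(weights, samples, slope=1.0)
_, soft_steep = empirical_losses(weights, samples, slope=50.0)
print(hard, soft_gentle, soft_steep)
```

In this sketch, a steeper sigmoid makes the smoothed loss track the hard error rate more closely, while a gentler one gives smoother gradients for descent-based training. The trade-off between the two is a design choice that the abstract does not address.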
