Paper information - Classification error from the theoretical Bayes classification risk

This article shows that the Minimum Classification Error (MCE) criterion function commonly used for discriminative design of speech recognition systems is equivalent to a Parzen window based estimate of the theoretical Bayes classification risk. In this analysis, each training token is mapped to the center of a Parzen kernel in the domain of a suitably defined random variable. The kernels are summed to produce a density estimate; this estimate in turn can easily be integrated over the domain of incorrect classifications, yielding the risk estimate. The expression of risk for each kernel can be seen to correspond directly to the usual MCE loss function. The resulting risk estimate can be minimized by suitable adaptation of the recognition system parameters that determine the mapping from training token to kernel center. This analysis provides a novel link between the MCE empirical cost measured on a finite training set and the theoretical Bayes classification risk.
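The correspondence described in the abstract can be checked numerically with a minimal sketch. The abstract does not specify the kernel or the sign convention, so several things here are assumptions: a logistic kernel (whose cumulative distribution is the sigmoid), a scalar misclassification measure `d` that is positive for incorrect classifications, and a bandwidth `h`. All function names are hypothetical. Under these assumptions, integrating each kernel over `d > 0` gives `sigmoid(d_i / h)`, which has the form of the usual sigmoid MCE loss.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logistic_kernel(u):
    """Logistic probability density (assumed kernel); its CDF is the sigmoid."""
    s = sigmoid(u)
    return s * (1.0 - s)


def parzen_density(d, centers, h):
    """Parzen estimate of the density of the misclassification measure at d.

    Each training token contributes one kernel centered at its own value
    of the misclassification measure (the entries of `centers`).
    """
    n = len(centers)
    return sum(logistic_kernel((d - c) / h) for c in centers) / (n * h)


def risk_by_integration(centers, h, upper=60.0, steps=60000):
    """Integrate the Parzen density over d > 0 with the trapezoidal rule.

    Assumption: d > 0 marks the domain of incorrect classifications.
    The integral is truncated at `upper`, where the kernels are negligible.
    """
    step = upper / steps
    total = 0.5 * (parzen_density(0.0, centers, h)
                   + parzen_density(upper, centers, h))
    for k in range(1, steps):
        total += parzen_density(k * step, centers, h)
    return total * step


def mce_empirical_loss(centers, h):
    """Average sigmoid loss over the training tokens (closed-form integral)."""
    return sum(sigmoid(c / h) for c in centers) / len(centers)


if __name__ == "__main__":
    # Hypothetical misclassification measures for four training tokens.
    centers = [-2.0, -0.5, 0.3, 1.5]
    h = 0.5
    print(risk_by_integration(centers, h))
    print(mce_empirical_loss(centers, h))
```

The two printed values should agree up to numerical integration error. As `h` shrinks, the sigmoid approaches a step function and the estimate approaches the fraction of training tokens with `d > 0`, that is, the raw error count on the training set.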

Shigeru Katagiri | Erik McDermott
