Paper information - SVM classifiers for ASR: A discussion about parameterization

SVM classifiers for ASR: A discussion about parameterization

Automatic Speech Recognition (ASR) is essentially a problem of pattern classification. However, the time dimension of the speech signal has prevented ASR from being posed as a simple static classification problem. Support Vector Machine (SVM) classifiers could provide an appropriate solution, since they are very well adapted to high-dimensional classification problems. Nevertheless, the use of SVMs for ASR is by no means straightforward, because SVM classifiers require a fixed-dimension input. In this paper we propose and compare three alternatives for adapting the parameterization to the fixed input dimension required by SVMs. We show that SVM classifiers outperform the conventional HMM-based ASR system when the speech signal is parameterized at properly selected instants.
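The fixed-dimension constraint mentioned in the abstract can be illustrated with a small sketch. The abstract does not describe the three proposed alternatives, so the code below is not the authors' method: it assumes a simple uniform selection of frames, and the function name `fixed_length_vector` and the parameter `n_instants` are hypothetical names chosen for illustration. It only shows how utterances of different durations could be mapped to vectors of one common length before being passed to an SVM.

```python
from typing import List, Sequence


def fixed_length_vector(frames: Sequence[Sequence[float]], n_instants: int) -> List[float]:
    """Map a variable-length sequence of feature frames to a fixed-length vector.

    Assumption (not from the paper): frames are picked at uniformly spaced
    instants and concatenated, so the output length is always
    n_instants * (features per frame), whatever the utterance duration.
    """
    if n_instants < 1:
        raise ValueError("n_instants must be at least 1")
    if not frames:
        raise ValueError("frames must contain at least one frame")

    last = len(frames) - 1
    if n_instants == 1:
        # A single instant: take the middle frame.
        indices = [last // 2]
    else:
        # Uniformly spaced frame indices from the first to the last frame.
        indices = [round(k * last / (n_instants - 1)) for k in range(n_instants)]

    vector: List[float] = []
    for i in indices:
        vector.extend(frames[i])
    return vector


# Two toy "utterances" with different durations (4 and 9 frames, 2 features each).
short_utt = [[0.0, 0.0], [1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
long_utt = [[float(t), float(10 * t)] for t in range(9)]

short_vec = fixed_length_vector(short_utt, n_instants=3)
long_vec = fixed_length_vector(long_utt, n_instants=3)
```

Both `short_vec` and `long_vec` have six components, so they could be stacked into a single training matrix for an SVM. Uniform selection is only the simplest choice; the abstract indicates that the choice of instants matters for recognition accuracy.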
