Robust voiced/unvoiced speech classification with self-organizing maps

The goal of this paper is to show the applicability of a new feature set to voiced/unvoiced (V/UV) classification of speech. The decision is made by a Kohonen-type Self-Organizing Map (SOM) operating on this feature set. The input features are computed in accordance with the human auditory system using Warped Linear Prediction (WLP) and are found to be robust to background noise, so the classification remains reliable for corrupted speech segments as well. The Self-Organizing Maps classify noisy patterns with an error rate of less than 2% at a signal-to-noise ratio of 9 dB.
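
As a rough illustration of how a Kohonen-type map can make a V/UV decision, the sketch below trains a small SOM on labeled feature vectors, assigns each map node the majority label of the training frames it wins, and classifies a new frame by the label of its best-matching node. It is not the paper's implementation: the 2-D toy features stand in for the WLP-based features, and the map size, learning-rate schedule, neighborhood width, and the helper names `make_toy_data`, `train_som`, `label_nodes`, and `classify` are all assumptions chosen for the example.

```python
import numpy as np


def make_toy_data(n=200, seed=0):
    """Hypothetical stand-in for WLP features: two 2-D Gaussian clusters."""
    rng = np.random.default_rng(seed)
    unvoiced = rng.normal([0.0, 4.0], 0.5, size=(n // 2, 2))  # label 0
    voiced = rng.normal([4.0, 0.0], 0.5, size=(n // 2, 2))    # label 1
    data = np.vstack([unvoiced, voiced])
    labels = np.repeat([0, 1], n // 2)
    return data, labels


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    # Grid coordinates of each node, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighborhood, floor 0.5
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))  # best-matching unit
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood on the grid
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def best_matching_units(weights, data):
    """Index of the closest node for every row of data."""
    dists = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(dists, axis=1)


def label_nodes(weights, data, labels):
    """Give each node the majority label of the training frames it wins."""
    bmus = best_matching_units(weights, data)
    node_labels = np.empty(len(weights), dtype=int)
    for k in range(len(weights)):
        hits = labels[bmus == k]
        if len(hits) > 0:
            node_labels[k] = np.bincount(hits, minlength=2).argmax()
        else:
            # Node won no frames: fall back to the label of the nearest sample.
            nearest = np.argmin(((data - weights[k]) ** 2).sum(axis=-1))
            node_labels[k] = labels[nearest]
    return node_labels


def classify(weights, node_labels, data):
    """V/UV decision: label of the best-matching node (0 = unvoiced, 1 = voiced)."""
    return node_labels[best_matching_units(weights, data)]
```

A typical use would be to call `train_som` on training features, `label_nodes` with the known V/UV labels, and then `classify` on unseen frames. Labeling nodes by majority vote after unsupervised training is one common way to turn a SOM into a classifier; the abstract does not say which labeling scheme the authors used.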
