Reliability-based estimation of the number of noisy features: application to model-order selection in the union models

This paper addresses the multi-stream approach to speech recognition. In a given set of feature streams, some features may be corrupted by noise. Ideally, these features should be excluded from recognition, but doing so requires a priori knowledge of the identity of the noisy features, that is, both their number and their location. We present a method for estimating the number of noisy feature streams that assumes no knowledge about the noise. It calculates the reliability of each feature stream and then evaluates the joint maximal reliability. Because the method reduces the uncertainty about the noisy features and is statistical in nature, it can also be used to increase the robustness of other classification systems. We apply the method to model-order selection in the union models. We performed tests on the TIDIGITS database, corrupted by noises affecting various numbers of feature streams. The experimental results show that the resulting model achieves recognition performance similar to that obtained with a priori knowledge of the identity of the corrupted features.
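
To illustrate why the model order matters, the following is a minimal sketch of a union-model score for a single frame. It assumes the standard formulation of the probabilistic union model, in which an order-M model over N streams sums the products of stream likelihoods over every combination of N - M streams. The function name `union_likelihood`, the example values, and the omission of the normalising constant are assumptions made for illustration. The reliability measure itself is not specified in the abstract and is not reproduced here.

```python
from itertools import combinations
from math import prod


def union_likelihood(stream_likelihoods, order):
    """Unnormalised order-`order` union-model score for one frame.

    Sums the product of stream likelihoods over every combination of
    (N - order) streams, so up to `order` corrupted streams can be left
    out of at least one term. (Hypothetical helper, not the authors' code.)
    """
    n = len(stream_likelihoods)
    if not 0 <= order < n:
        raise ValueError("order must satisfy 0 <= order < number of streams")
    keep = n - order
    return sum(prod(subset) for subset in combinations(stream_likelihoods, keep))


# Made-up per-stream likelihoods; the second stream is assumed to be corrupted.
noisy = [0.8, 1e-6, 0.9, 0.6]

# Order 0 is the plain product and collapses because of the corrupted stream.
full_product = union_likelihood(noisy, 0)

# Order 1 keeps one term that excludes the corrupted stream entirely.
order_one = union_likelihood(noisy, 1)
```

In this sketch the order would be set to the estimated number of noisy streams. An order that is too low leaves a corrupted stream in every term, while an order that is too high discards clean evidence.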