Maximum mutual information based acoustic-features representation of phonological features for speech recognition

This paper addresses the problem of finding a subset of the acoustic feature space that best represents a set of phonological features. A maximum mutual information approach is presented for selecting acoustic features to be combined to represent the distinctions coded by a set of correlated phonological features. Each set of phonological features is chosen on the basis of acoustic-phonetic similarity, so the sets can be considered approximately independent. Consequently, the outputs of recognizers that recognize these sets independently, each using the acoustic representation produced by the algorithm presented in this paper, can be combined to increase the efficiency and robustness of speech recognition systems. The mutual information between the phonological feature sets and their resulting acoustic representation is increased by up to 220% over the best single-type acoustic representation of the same length.
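The abstract does not spell out the selection procedure, but the objective it names, maximizing the mutual information I(A; P) between a candidate acoustic-feature subset A and a phonological feature set P, is commonly realized as a greedy forward search. The sketch below is a minimal illustration under that assumption, not the paper's algorithm: continuous acoustic features are quantile-discretized, and at each step the candidate feature that most increases the estimated joint mutual information with the phonological labels is added. All names (greedy_mmi_select, discrete_mi, n_bins, and so on) are hypothetical.

```python
import numpy as np

def discretize(X, n_bins=8):
    """Quantile-discretize each continuous acoustic feature (column) of X."""
    Xd = np.empty(X.shape, dtype=np.int64)
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        Xd[:, j] = np.digitize(X[:, j], edges)
    return Xd

def discrete_mi(X, y):
    """Plug-in estimate of I(X; y) in bits for discrete feature rows X and labels y."""
    n = len(y)
    # Encode each distinct row of X as one symbol, so joint MI of the whole
    # selected set is estimated, not a sum of per-feature terms.
    _, x_ids = np.unique(X, axis=0, return_inverse=True)
    _, y_ids = np.unique(y, return_inverse=True)
    x_ids, y_ids = x_ids.ravel(), y_ids.ravel()
    joint = np.zeros((x_ids.max() + 1, y_ids.max() + 1))
    np.add.at(joint, (x_ids, y_ids), 1.0)  # joint counts over (x, y)
    p_xy = joint / n
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0
    return float((p_xy[nz] * np.log2(p_xy[nz] / (p_x * p_y)[nz])).sum())

def greedy_mmi_select(X, y, k, n_bins=8):
    """Greedily pick k acoustic features maximizing I(selected set; labels y)."""
    Xd = discretize(X, n_bins)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        scores = [(discrete_mi(Xd[:, selected + [f]], y), f) for f in remaining]
        best_mi, best_f = max(scores)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected, best_mi
```

In the paper's setting, y would hold the discrete values of one correlated phonological feature set, and k would match the length of the single-type acoustic representation used as the baseline for comparison. Note that this plug-in joint-MI estimate needs enough frames per distinct discretized row to be reliable, one reason the paper's actual construction may differ from this sketch.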
