Relevancy of time-frequency features for phonetic classification measured by mutual information

In this paper we use mutual information to study how information relevant for phonetic classification is distributed in time and frequency. A large database of hand-labeled fluent speech is used to compute (a) the mutual information between phoneme labels and a single point of logarithmic energy in the time-frequency plane, and (b) the joint mutual information between phoneme labels and two such points.
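
As an illustration of the two quantities above, the following is a minimal sketch of a histogram (plug-in) estimate of mutual information between discrete phoneme labels and one or two log-energy values. The abstract does not specify the estimator used in the paper; the function name `mutual_information`, the equal-occupancy quantization, and the bin count `n_bins` are assumptions made for this example only.

```python
import numpy as np

def mutual_information(labels, features, n_bins=8):
    """Plug-in estimate of I(labels; features) in bits.

    labels:   (N,) integer phoneme labels.
    features: (N,) or (N, d) log-energy values; d=1 corresponds to case (a),
              d=2 to the joint case (b).
    The quantization scheme is an assumption of this sketch, not the paper's.
    """
    labels = np.asarray(labels)
    features = np.asarray(features, dtype=float).reshape(len(labels), -1)

    # Quantize each feature dimension into equal-occupancy bins and
    # combine the per-dimension bin indices into one discrete code.
    codes = np.zeros(len(labels), dtype=np.int64)
    for j in range(features.shape[1]):
        col = features[:, j]
        edges = np.quantile(col, np.linspace(0, 1, n_bins + 1)[1:-1])
        codes = codes * n_bins + np.searchsorted(edges, col)

    # Build the empirical joint distribution of (label, code).
    _, lab_idx = np.unique(labels, return_inverse=True)
    _, code_idx = np.unique(codes, return_inverse=True)
    joint = np.zeros((lab_idx.max() + 1, code_idx.max() + 1))
    np.add.at(joint, (lab_idx, code_idx), 1.0)
    joint /= joint.sum()

    # I(L; X) = sum p(l, x) * log2( p(l, x) / (p(l) p(x)) ) over nonzero cells.
    p_lab = joint.sum(axis=1, keepdims=True)
    p_code = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_lab @ p_code)[nz])))
```

Calling `mutual_information(labels, energy[:, k])` for every time-frequency point `k` would give a map for case (a), and passing a two-column array would give the joint quantity of case (b); here `energy` is a hypothetical array of log-energy values. Plug-in estimates of this kind are biased upward for small samples, so the bin count should be chosen with the amount of data in mind.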
