On the effect of fundamental frequency on amplitude and frequency modulation patterns in speech resonances

Amplitude modulation (AM) and frequency modulation (FM) in speech signals are believed to reflect various non-linear phenomena in the speech production process. In this paper, the AM and FM patterns of the first three speech resonances are analyzed in relation to the fundamental frequency (F0). The formant tracks are estimated, and the resonant signals are extracted and demodulated. The Amplitude Modulation Index (AMI) and Frequency Modulation Index (FMI) are then computed and examined as functions of F0 and of the relation between F0 and the first formant frequency (F1). Both AMI and FMI are significantly affected by F0, with modulations occurring more often at low F0. Evidence of non-linear interaction between the glottal source and the vocal tract is found in the dependence of the modulation patterns on the ratio F1/F0: AMI increases when a pitch harmonic coincides with F1, while FMI shows complementary behavior.
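The demodulation and index steps can be illustrated with a minimal sketch. The abstract does not state which demodulator or which index definitions are used, so everything here is an assumption made for illustration: the Teager-Kaiser energy operator with the DESA-2 energy separation algorithm stands in for the demodulator, `modulation_depth` is the classical (max - min)/(max + min) depth rather than the paper's AMI or FMI, and `harmonic_offset` is a hypothetical helper that measures how close F1 lies to a harmonic of F0. The input is a synthetic AM tone, not speech.

```python
import math

def teager(x):
    """Teager-Kaiser energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def desa2(x, fs):
    """DESA-2 estimates of instantaneous amplitude and frequency (Hz).

    Assumed stand-in for the demodulator; not confirmed by the abstract.
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1], aligned with x[1:-1].
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager(x)  # psi_x[i] belongs to sample i + 1 of x
    psi_y = teager(y)  # psi_y[k] belongs to sample k + 2 of x
    amp, freq = [], []
    for k in range(len(psi_y)):
        px, py = psi_x[k + 1], psi_y[k]
        if px <= 0.0 or py <= 0.0:
            continue  # skip samples where the energy estimate is unusable
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        amp.append(2.0 * px / math.sqrt(py))
        freq.append(0.5 * math.acos(c) * fs / (2.0 * math.pi))
    return amp, freq

def modulation_depth(v):
    """Classical modulation depth (max - min) / (max + min); illustrative only."""
    hi, lo = max(v), min(v)
    return (hi - lo) / (hi + lo)

def harmonic_offset(f1, f0):
    """Distance of F1/F0 from the nearest integer; 0 means a harmonic sits on F1."""
    r = f1 / f0
    return abs(r - round(r))

# Synthetic resonance: 1500 Hz carrier, amplitude-modulated at F0 = 100 Hz.
fs, fc, f0, kappa = 16000.0, 1500.0, 100.0, 0.3
x = [(1.0 + kappa * math.cos(2.0 * math.pi * f0 * n / fs))
     * math.cos(2.0 * math.pi * fc * n / fs) for n in range(1600)]

amp, freq = desa2(x, fs)
ami_est = modulation_depth(amp)      # should land near kappa
mean_freq = sum(freq) / len(freq)    # should land near fc
print(round(ami_est, 3), round(mean_freq, 1))
print(harmonic_offset(500.0, 100.0), harmonic_offset(550.0, 100.0))
```

On real speech, each resonance would first be isolated with a band-pass filter centered on the estimated formant track before demodulation; that step is omitted here to keep the sketch dependency-free.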
