Hilbert spectral analysis of vowels using intrinsic mode functions

In recent work, we presented mathematical theory and algorithms for time-frequency analysis of non-stationary signals. In that work, we generalized the definition of the Hilbert spectrum by using a superposition of complex AM-FM components, each parameterized by its Instantaneous Amplitude (IA) and Instantaneous Frequency (IF). With our Hilbert Spectral Analysis (HSA) approach, the IA and IF estimates can reveal underlying signal structure far more accurately than prior approaches to time-frequency analysis. In this paper, we apply HSA to speech and compare the results with both narrowband and wideband spectrograms. We demonstrate that the AM-FM components, assumed to be intrinsic mode functions, align well with the energy concentrations of the spectrograms, and we highlight fine structure present in the Hilbert spectrum. As an example, we show previously unreported intra-glottal pulse phenomena that are not readily apparent in other analyses. Such fine-scale analyses may have applications in speech-based medical diagnosis and in automatic speech recognition (ASR) for pathological speakers.
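
For readers unfamiliar with the AM-FM model, the superposition described above can be sketched as follows. The notation is an assumption made for illustration and is not taken from this abstract: $K$ is the number of components, $a_k(t)$ is the IA, $\omega_k(t)$ is the IF, and $\phi_k$ is a phase reference for the $k$-th component.

```latex
% Illustrative sketch only: symbols K, a_k, \omega_k, \phi_k are assumed notation.
% Complex signal as a superposition of K AM-FM components:
z(t) = \sum_{k=0}^{K-1} a_k(t)\,
       \exp\!\left( j \left[ \int_{-\infty}^{t} \omega_k(\tau)\, d\tau + \phi_k \right] \right)

% The observed real signal is the real part of the complex signal:
x(t) = \operatorname{Re}\{ z(t) \}
```

Under this sketch, a Hilbert spectrum can be read as a display of each component's amplitude $a_k(t)$ along its frequency track $\omega_k(t)$ over time, in contrast to a spectrogram, which spreads energy according to a fixed analysis window.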
