AM-FM Features and Their Application to Noise Robust Speech Recognition: A Review

Extracting and selecting the best parametric representation of the acoustic signal is an important task in the design of any speech recognition system. Many parametric representations of the speech signal are available for this task, including Linear Predictive Coding (LPC) and Mel-Frequency Cepstral Coefficients (MFCCs). MFCCs are currently the most popular choice, but they have a shortcoming: the signal is assumed to be stationary within each analysis frame, so MFCCs cannot capture non-stationary behavior of the signal within that frame. To overcome this problem, several researchers have used amplitude and frequency modulation/demodulation (AM-FM) techniques to extract features from the speech signal. This paper outlines several techniques that apply the AM-FM model to a broadband signal such as speech, and their use for feature extraction in speech recognition. The use of Amplitude Modulation (AM), Frequency Modulation (FM), and Teager Energy Cepstral Coefficients (TECC) features is also examined.
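As a rough illustration of the kind of operator that energy-based AM-FM features build on, the sketch below computes the discrete Teager-Kaiser energy, psi[n] = x[n]^2 - x[n-1]*x[n+1]. For a pure tone A*cos(omega*n + phase), this quantity equals A^2 * sin^2(omega) at every sample, so it tracks amplitude and frequency jointly. The function names `teager_energy` and `tone`, and the test values, are hypothetical and chosen only for this example. This is not a TECC front end, which would also need a filterbank, frame averaging, and a cepstral transform.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The first and last samples have no two-sided neighbours, so the
    result is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def tone(amplitude, omega, length, phase=0.0):
    """Hypothetical helper: samples of amplitude * cos(omega * n + phase)."""
    return [amplitude * math.cos(omega * n + phase) for n in range(length)]


if __name__ == "__main__":
    # Assumed example values: amplitude 2.0, 0.3 rad/sample.
    amplitude, omega = 2.0, 0.3
    psi = teager_energy(tone(amplitude, omega, 50))
    expected = amplitude ** 2 * math.sin(omega) ** 2
    # Both printed numbers should agree up to floating-point rounding.
    print(psi[0], expected)
```

Because the operator uses only three neighbouring samples, it responds to changes within a frame, which is the property that motivates its use for non-stationary signals.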