Compression of Model-based Group Delay Function for Robust Speech Recognition

In this paper, we improve the performance of the ARGDMF feature by adding a nonlinear filtering block. ARGDMF is a group delay-based feature consisting of four main parts: autoregressive (AR) model extraction, group delay function (GDF) calculation, compression, and scale information augmentation. The main problem with the GDF is its spiky nature, which is solved by coupling the GDF with an all-pole model. The compression step includes two stages similar to those of MFCC, but without taking the logarithm of the output energies. The fourth part augments the phase-based feature vector with scale information. The novelty of this paper lies in adding a filtering block to the compression process to make it more efficient. This filter aims to improve the performance of ARGDMF by better adjusting the dynamic range and the sharpness of the formants. The feature was evaluated on the Aurora 2 database. In the presence of both additive and convolutional noise, the proposed method noticeably outperforms MFCCs and other phase-based features, without a notable increase in computational load.
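
As a rough illustration of the first two parts (AR model extraction and GDF calculation), the following sketch fits an all-pole model to a frame and evaluates its group delay. It is not the authors' implementation: the function names `ar_coefficients` and `ar_group_delay`, the autocorrelation-method fit, and the FFT size are assumptions made for this example. The compression stage, the nonlinear filter, and the scale information augmentation are omitted because the abstract does not specify them.

```python
import numpy as np


def ar_coefficients(frame, order):
    """Fit an all-pole model with the autocorrelation method.

    Returns the denominator polynomial [1, a1, ..., ap] of H(z) = 1 / A(z).
    (Hypothetical helper; the paper may use a different estimator.)
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Unnormalized autocorrelation for lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:].
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[idx], -r[1:])
    return np.concatenate(([1.0], a))


def ar_group_delay(a, n_fft=512):
    """Group delay (in samples) of H(z) = 1 / A(z) on the rfft frequency grid.

    The group delay of A is Re{FFT(n * a[n]) / FFT(a[n])}; the all-pole
    model 1 / A has the negated value. Assumes A has no zeros on the
    unit circle, which holds for a stable model.
    """
    a = np.asarray(a, dtype=float)
    A = np.fft.rfft(a, n_fft)
    dA = np.fft.rfft(np.arange(len(a)) * a, n_fft)
    return -np.real(dA / A)
```

For a speech frame `x`, `ar_group_delay(ar_coefficients(x, order))` gives a smooth, model-based group delay curve. Because it is computed from the low-order polynomial A(z) rather than from the raw signal spectrum, it avoids the spikes that zeros near the unit circle cause in a directly computed GDF.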