On using missing-feature theory with cepstral features - approximations to the multivariate integral

Missing Feature Theory (MFT), a powerful and systematic framework for robust speech recognition, has so far not been applied optimally to features based on linear transforms, such as MFCC or HLDA, which are necessary for state-of-the-art recognition accuracy. The obstacle is the intractable multivariate integral that arises in bounded marginalization. This paper seeks to enable better use of MFT with MFCC features and diagonal covariances through two approximations of this integral: numeric integration by linear sampling, and approximation by the integrand’s maximum. The former is made feasible through a “tridiagonal” approximation of MFCC, based on interpreting MFCC as bandpass filtering of the filterbank vector. The latter is solved through quadratic programming. The effectiveness of both approximations is shown for recognizing reverberated TIMIT speech utilizing temporal auditory masking.
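
For orientation, the integral in question and the second approximation can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: x denotes the filterbank vector with reliable part x_r (held fixed) and unreliable part x_u, D the linear transform that produces the cepstral features, \mu and \Sigma the mean and diagonal covariance of one Gaussian in the transformed domain, l and h the bounds on x_u, and C an unspecified scaling constant.

```latex
% Assumed notation for illustration only (see the lead-in above).
\begin{align}
  % Bounded marginalization: integrate the Gaussian over the unreliable components.
  I &= \int_{l}^{h} \mathcal{N}\!\left(D x;\, \mu, \Sigma\right) \, \mathrm{d}x_u \\
  % Approximation by the integrand's maximum over the same box.
  I &\approx C \cdot \max_{l \le x_u \le h} \mathcal{N}\!\left(D x;\, \mu, \Sigma\right) \\
  % Maximizing the Gaussian is equivalent to a box-constrained quadratic program.
  x^{*} &= \operatorname*{arg\,min}_{l \le x_u \le h}
           \left(D x - \mu\right)^{\top} \Sigma^{-1} \left(D x - \mu\right)
\end{align}
```

Because D mixes all components of x, the integrand does not factor into one-dimensional terms even though \Sigma is diagonal, which is why the integral is hard to evaluate exactly. The last line has a quadratic objective with box constraints, consistent with the abstract's statement that the maximum is found through quadratic programming.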