Robustness of speech recognition using genetic algorithms and a Mel-cepstral subspace approach

This paper presents a method for compensating Mel-frequency cepstral coefficients (MFCCs) in an HMM-based speech recognition system operating under telephone-channel degradations. The proposed technique combines the Karhunen-Loève transform (KLT) with genetic algorithms (GAs). The idea is to project the band-limited MFCCs onto a subspace spanned by genetically optimized KLT principal axes. Experiments on the NTIMIT telephone speech database show a clear improvement when the method is applied. Word recognition results obtained with the HTK toolkit, using N-mixture triphone models and a bigram language model, are presented and discussed.
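
To make the subspace projection concrete, the following is a minimal sketch of a plain KLT projection of feature vectors onto their leading principal axes. It is an illustration under stated assumptions, not the paper's implementation: the genetic optimization of the axes is not shown, the principal axes are estimated by simple power iteration with deflation, and all function names (`mean_vector`, `covariance`, `principal_axes`, `project`) and the toy data are hypothetical.

```python
import random


def mean_vector(frames):
    """Per-dimension mean of a list of equal-length feature vectors."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]


def covariance(frames, mean):
    """Sample covariance matrix (normalized by the number of frames)."""
    n, dim = len(frames), len(mean)
    cov = [[0.0] * dim for _ in range(dim)]
    for f in frames:
        c = [f[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += c[i] * c[j] / n
    return cov


def principal_axes(cov, k, iters=500, seed=0):
    """Leading k eigenvectors of cov via power iteration with deflation."""
    rng = random.Random(seed)
    dim = len(cov)
    mat = [row[:] for row in cov]  # work on a copy; cov is left untouched
    axes = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for this axis.
        lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(dim))
                  for i in range(dim))
        axes.append(v)
        # Deflate: remove the component just found.
        for i in range(dim):
            for j in range(dim):
                mat[i][j] -= lam * v[i] * v[j]
    return axes


def project(frame, mean, axes):
    """Project one vector onto the subspace spanned by axes (about the mean)."""
    dim = len(frame)
    c = [frame[d] - mean[d] for d in range(dim)]
    coeffs = [sum(a[d] * c[d] for d in range(dim)) for a in axes]
    return [mean[d] + sum(coeffs[i] * axes[i][d] for i in range(len(axes)))
            for d in range(dim)]


# Toy 3-dimensional "features" standing in for MFCC vectors (assumed data).
frames = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
mu = mean_vector(frames)
axes = principal_axes(covariance(frames, mu), k=1)
print(project([5.5, 5.0, 6.5], mu, axes))
```

In this sketch the number of retained axes `k` is fixed by hand. In the approach described above, the principal axes themselves are optimized genetically, which this example does not attempt to reproduce.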