Implementation of vocal tract length normalization for phoneme recognition on the TIMIT speech corpus

Inter-speaker variability is one of the problems faced by speech recognition systems: it degrades performance when recognizing speech spoken by different speakers. Vocal Tract Length Normalization (VTLN) is known to improve recognition performance by compensating the speech signal using a specific warping factor. Experiments were conducted on the TIMIT speech corpus using the Hidden Markov Model Toolkit (HTK), together with an implementation of VTLN, in order to show an improvement in speaker-independent phoneme recognition. The results show better recognition performance with a bigram language model than with a unigram language model: the best Phoneme Error Rate (PER) is 28.8% for the bigram model and 38.09% for the unigram model. The best warp factor for normalization in this experiment is 1.40.
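To illustrate what compensating the signal with a warping factor can look like, the following is a minimal sketch of a piecewise-linear frequency warp, a form commonly used for VTLN. It is not taken from this work or from HTK: the function name `warp_frequency`, the breakpoint rule `break_ratio`, and the direction convention (a factor above 1 stretches the low-frequency axis) are assumptions made for the example. The default `f_max` of 8000 Hz is the Nyquist frequency of TIMIT's 16 kHz recordings.

```python
def warp_frequency(f, alpha, f_max=8000.0, break_ratio=0.875):
    """Piecewise-linear frequency warp (illustrative sketch, not HTK's code).

    f           -- input frequency in Hz, 0 <= f <= f_max
    alpha       -- warp factor (assumed convention: alpha > 1 stretches)
    f_max       -- upper band edge in Hz (Nyquist for 16 kHz audio)
    break_ratio -- hypothetical parameter placing the breakpoint
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= f <= f_max:
        raise ValueError("f must lie in [0, f_max]")

    # Breakpoint chosen so that alpha * f0 never exceeds f_max,
    # which keeps the warp monotonically increasing.
    f0 = break_ratio * f_max * min(1.0, 1.0 / alpha)

    if f <= f0:
        # Below the breakpoint: plain linear scaling by alpha.
        return alpha * f

    # Above the breakpoint: straight line from (f0, alpha * f0)
    # to (f_max, f_max), so the band edge maps onto itself.
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (f - f0)


if __name__ == "__main__":
    # Warp a few frequencies with the factor reported as best above.
    for hz in (0.0, 1000.0, 4000.0, 6000.0, 8000.0):
        print(hz, "->", round(warp_frequency(hz, 1.40), 2))
```

In practice such a function would be applied to the centre frequencies of the filterbank during feature extraction, and the warp factor would be chosen by searching a grid of candidate values and keeping the one that gives the best recognition score.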