A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet loss

This work compares three compensation methods for speech recognition in the presence of packet loss. Two methods, cubic interpolation and a novel maximum a posteriori (MAP) estimation, aim to restore the feature vector stream when packets are lost, while the third applies compensation in the decoding stage of recognition through missing feature theory. To improve performance under burst-like packet loss, interleaving is introduced to disperse bursts of loss. Experiments on the ETSI Aurora connected digit task show that the best performance is given by a combination of missing feature theory and cubic interpolation, which raises performance from 50.3% to 69.8% at a packet loss rate of 50% and an average burst length of 20 packets. Including interleaving further increases performance to over 76%.
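
As an illustration of how interleaving disperses a burst of loss, the following is a minimal sketch of a generic block interleaver. It is not necessarily the interleaving scheme used in this work; the function names (`interleave`, `deinterleave`, `apply_burst`, `longest_loss_run`), the depth of 4, and the 16-packet stream are all assumptions chosen for illustration.

```python
def interleave(packets, depth):
    """Generic block interleaver: fill `depth` rows in order, send column by column."""
    if len(packets) % depth != 0:
        raise ValueError("packet count must be a multiple of depth")
    width = len(packets) // depth
    return [packets[r * width + c] for c in range(width) for r in range(depth)]


def deinterleave(packets, depth):
    """Invert `interleave`, putting each received slot back in its original position."""
    width = len(packets) // depth
    restored = [None] * len(packets)
    i = 0
    for c in range(width):
        for r in range(depth):
            restored[r * width + c] = packets[i]
            i += 1
    return restored


def apply_burst(packets, start, length):
    """Simulate a burst of loss by marking `length` consecutive slots as None."""
    return [None if start <= i < start + length else p
            for i, p in enumerate(packets)]


def longest_loss_run(packets):
    """Length of the longest run of consecutive lost (None) packets."""
    longest = current = 0
    for p in packets:
        current = current + 1 if p is None else 0
        longest = max(longest, current)
    return longest


# Hypothetical stream of 16 packets hit by a burst of 4 consecutive losses.
stream = list(range(16))

# Without interleaving, the burst removes 4 adjacent feature packets.
plain = apply_burst(stream, start=4, length=4)

# With interleaving, the same channel burst is spread out after deinterleaving.
restored = deinterleave(apply_burst(interleave(stream, 4), start=4, length=4), 4)

print(longest_loss_run(plain))     # 4
print(longest_loss_run(restored))  # 1
```

In this sketch the same number of packets is lost in both cases, but after deinterleaving the losses are isolated, which leaves a restoration method such as interpolation with only short gaps to fill. The cost is added delay, since the sender must buffer a full block before transmission.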