iLBC-Based Transparametrization: A Real Alternative to DSR for Speech Recognition Over Packet Networks

This paper proposes a method for remote recognition of speech coded with the iLBC codec, which is employed by a number of VoIP systems. The usual way to recognize coded speech is to first decode the speech signal and then feed it to the recognition engine; our system instead converts the iLBC parameters directly into recognition features. The main advantage of this approach is that it avoids decoder post-processing, which was originally conceived to improve perceptual quality but can be harmful to a recognition system. Our method ensures compatibility between the speech spectra provided by the iLBC codec and those employed for cepstrum computation, and it introduces a robust packet loss concealment strategy suited to recognition. Our experimental results show that the proposed system outperforms recognition from iLBC-decoded speech and performs similarly to a distributed speech recognition (DSR) system over a clean or a degraded transmission channel.
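To make the idea of deriving recognition features from codec parameters rather than from decoded audio more concrete, the sketch below applies the textbook recursion that maps linear-prediction coefficients to cepstral coefficients. It is only an illustration under stated assumptions and not the transparametrization proposed here: the function name `lpc_to_cepstrum`, the sign convention A(z) = 1 + sum_k a_k z^-k, and the omission of the gain term c_0 are choices made for this example, and it does not address the spectral compatibility or packet loss concealment steps mentioned above.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of the all-pole model 1/A(z) from its LPC coefficients.

    Assumed convention (hypothetical, for illustration only):
        A(z) = 1 + a[0]*z^-1 + a[1]*z^-2 + ... + a[p-1]*z^-p
    Returns [c_1, ..., c_n_ceps]; the gain-dependent c_0 is omitted.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is left unused
    for n in range(1, n_ceps + 1):
        # Sum over k of k * c_k * a_{n-k}, restricted to 1 <= n-k <= p
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0  # a_n is zero beyond the LPC order
        c[n] = -a_n - acc / n
    return c[1:]


# Single-pole check: for A(z) = 1 - 0.5 z^-1 the cepstrum is c_n = 0.5**n / n.
ceps = lpc_to_cepstrum([-0.5], 3)
```

The recursion works on the prediction coefficients alone, so no waveform is synthesized and no perceptual post-filter is involved, which is the property the abstract highlights.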