Recovering of packet loss for Distributed Speech Recognition

This work addresses the packet loss problem in a Distributed Speech Recognition architecture. A packet loss simulation model is first proposed to simulate different channel degradation conditions. Under these conditions, the performance of our continuous French speech recognition system is evaluated for packets containing different numbers of speech feature vectors. Several reconstruction strategies for recovering the lost information are proposed and evaluated. The results first confirm the intuitive expectation that the word error rate increases with the size of the lost packets and with the channel degradation level. However, they also show that simple reconstruction strategies make it possible to recover acceptable performance. The most efficient strategies use an interleaving technique to distribute the speech information among packets, combined with interpolation methods to estimate the lost acoustic features.
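
To make the combination of interleaving and interpolation concrete, here is a minimal Python sketch. The abstract does not specify the interleaving scheme or the interpolation method, so the choices below are assumptions made for illustration only: frames are assigned to packets round-robin, and lost feature vectors are estimated by linear interpolation between the nearest received neighbours. The function names `interleave`, `deinterleave`, and `interpolate` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Set

Frame = List[float]  # one acoustic feature vector


def interleave(frames: List[Frame], n_packets: int) -> List[Dict[int, Frame]]:
    """Assumed scheme: frame i goes to packet i % n_packets, so that
    consecutive frames never travel in the same packet."""
    packets: List[Dict[int, Frame]] = [{} for _ in range(n_packets)]
    for i, frame in enumerate(frames):
        packets[i % n_packets][i] = frame
    return packets


def deinterleave(packets: List[Dict[int, Frame]], lost: Set[int],
                 n_frames: int) -> List[Optional[Frame]]:
    """Rebuild the frame sequence; frames from lost packets stay None."""
    received: List[Optional[Frame]] = [None] * n_frames
    for p, packet in enumerate(packets):
        if p in lost:
            continue
        for i, frame in packet.items():
            received[i] = frame
    return received


def interpolate(received: List[Optional[Frame]]) -> List[Frame]:
    """Assumed method: linear interpolation between the nearest received
    frames; at the sequence edges the nearest received frame is repeated."""
    known = [i for i, f in enumerate(received) if f is not None]
    if not known:
        raise ValueError("no frames received")
    out: List[Frame] = []
    for i, f in enumerate(received):
        if f is not None:
            out.append(list(f))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out.append(list(received[right]))
        elif right is None:
            out.append(list(received[left]))
        else:
            w = (i - left) / (right - left)
            out.append([(1 - w) * a + w * b
                        for a, b in zip(received[left], received[right])])
    return out


if __name__ == "__main__":
    # Toy features that vary linearly in time (illustrative data only).
    frames = [[float(i), 2.0 * i] for i in range(8)]
    packets = interleave(frames, n_packets=4)
    received = deinterleave(packets, lost={1}, n_frames=len(frames))
    print(interpolate(received))
```

With this round-robin assignment, losing one packet removes isolated frames rather than a contiguous run, so each gap has received neighbours on both sides. Without interleaving, the same loss would remove a block of adjacent frames, which is harder to estimate by interpolation.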
