Neural and fuzzy computation techniques for playout delay adaptation in VoIP networks

Playout delay adaptation algorithms are widely used in real-time voice communication over packet-switched networks to counteract the effects of network jitter at the receiver. Whereas conventional algorithms developed for silence-suppressed speech transmission preserve the relative temporal structure of speech frames/packets within a talkspurt and adjust the playout delay only between talkspurts (intertalkspurt adaptation), more recently developed algorithms strive for better quality by also adapting the playout delay within a talkspurt (intratalkspurt adaptation). Both intertalkspurt and intratalkspurt algorithms rely on short-term estimates of the network delay characteristics that upcoming voice packets will experience. This paper presents the use of novel neural networks and fuzzy systems as estimators of these network delay characteristics. Their performance is compared with that of a number of traditional techniques under both the inter- and intratalkspurt adaptation paradigms. The design of a novel fuzzy trend analyzer system (FTAS) for network delay trend analysis, and its use in intratalkspurt playout delay adaptation, are presented in greater detail. The performance of the proposed mechanism is analyzed using measured Internet delays.
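
To make the estimation step concrete, the following is a minimal sketch of one widely cited traditional estimator, the exponentially weighted mean and variation estimator of Ramjee et al. (INFOCOM '94), used for intertalkspurt adaptation. The abstract does not state which traditional techniques the paper evaluates, so this choice is an assumption, and the names `DelayEstimator` and `schedule` are hypothetical. It is not the neural or fuzzy estimator proposed in the paper.

```python
from dataclasses import dataclass

# Smoothing weight commonly quoted for this estimator (assumed here).
ALPHA = 0.998002


@dataclass
class DelayEstimator:
    """Running estimate of network delay mean and variation (hypothetical helper)."""
    alpha: float = ALPHA
    safety: float = 4.0   # multiplier on the variation term
    mean: float = 0.0
    var: float = 0.0
    primed: bool = False

    def update(self, delay_ms: float) -> None:
        """Fold the network delay of one received packet into the estimates."""
        if not self.primed:
            # Initialise from the first observed delay.
            self.mean, self.var, self.primed = delay_ms, 0.0, True
            return
        self.mean = self.alpha * self.mean + (1 - self.alpha) * delay_ms
        self.var = self.alpha * self.var + (1 - self.alpha) * abs(self.mean - delay_ms)

    def playout_delay(self) -> float:
        """Playout delay: estimated mean plus a safety margin of the variation."""
        return self.mean + self.safety * self.var


def schedule(talkspurts, est):
    """Intertalkspurt adaptation: fix the playout delay at the start of each
    talkspurt and count packets arriving too late for it.

    talkspurts: list of lists of per-packet network delays in milliseconds.
    Returns a list of (playout_delay_ms, late_packet_count) per talkspurt.
    """
    results = []
    for spurt in talkspurts:
        est.update(spurt[0])             # first packet of the talkspurt
        playout = est.playout_delay()    # held constant within the talkspurt
        late = sum(1 for d in spurt if d > playout)
        for d in spurt[1:]:
            est.update(d)                # keep estimating for the next talkspurt
        results.append((playout, late))
    return results
```

An intratalkspurt scheme would instead revise the playout delay per packet and absorb the change by time-scaling the speech. In the paper's setting, a neural or fuzzy estimator would take the place of `DelayEstimator`.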
