Optimization of source and channel coding for voice over IP

Voice over Internet Protocol (VoIP) applications must typically trade off the bits allocated to forward error correction (FEC) against those allocated to source coding in order to achieve the best speech quality at a given packet loss rate. In this paper, we present a new scheme that optimizes speech quality subject to a bandwidth constraint and the prevailing packet loss rate. The scheme adopts the adaptive multi-rate (AMR) speech codec along with an FEC scheme based on exclusive OR (XOR) operations. Retransmission is also taken into account when the round-trip time (RTT) is within a certain limit. We use a simplified E-model as the objective quality metric. Subjective listening tests show that our scheme significantly improves perceptual speech quality compared with a non-adaptive baseline speech transmission system.
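The tradeoff described above can be illustrated with a minimal sketch. It is not the paper's algorithm, only one plausible reading under stated assumptions: packet losses are independent, one XOR parity packet protects a group of k media packets, packet header overhead is ignored, retransmission is left out, and quality is scored with the packet-loss term of the ITU-T G.107 E-model with the default basic R value of 93.2. The function names, the `max_group` parameter, and the per-mode impairment values `ie_by_rate` and `bpl` are hypothetical inputs and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def xor_fec_residual_loss(p: float, k: int) -> float:
    """Residual loss rate when one XOR parity packet protects k media packets.

    Assumes independent losses with probability p. A media packet stays lost
    only if it is lost AND at least one of the other k packets in its group
    (k - 1 media packets plus the parity packet) is lost as well.
    """
    return p * (1.0 - (1.0 - p) ** k)


def r_factor(ie: float, bpl: float, loss: float,
             delay_impairment: float = 0.0) -> float:
    """Simplified E-model rating: R = 93.2 - Id - Ie_eff (random loss).

    ie and bpl are codec-specific impairment and loss-robustness factors;
    loss is a fraction in [0, 1]; bpl must be positive.
    """
    ppl = 100.0 * loss  # the E-model expresses packet loss in percent
    ie_eff = ie + (95.0 - ie) * ppl / (ppl + bpl)
    return 93.2 - delay_impairment - ie_eff


def choose_mode(ie_by_rate: Dict[float, float], bpl: float, loss: float,
                budget_kbps: float, max_group: int = 4
                ) -> Optional[Tuple[float, Optional[int], float]]:
    """Exhaustively pick (source rate, FEC group size k, R) maximizing R.

    k is None when no FEC is used. Header overhead is ignored (assumption),
    so FEC scales the bit rate by (k + 1) / k. Returns None if no
    combination fits within the bandwidth budget.
    """
    best = None
    for rate, ie in sorted(ie_by_rate.items()):
        # Try "no FEC" first so that ties favour the cheaper option.
        for k in [None] + list(range(1, max_group + 1)):
            total = rate if k is None else rate * (k + 1) / k
            if total > budget_kbps:
                continue
            residual = loss if k is None else xor_fec_residual_loss(loss, k)
            r = r_factor(ie, bpl, residual)
            if best is None or r > best[2]:
                best = (rate, k, r)
    return best
```

Calling `choose_mode` with a table that maps AMR source rates in kbit/s (for example 4.75 and 12.2) to placeholder impairment values shows the expected behaviour: at zero loss the highest rate that fits the budget is selected without FEC, while at higher loss rates a lower source rate combined with XOR parity can score better within the same budget. Real impairment values would have to come from measurements or from the codec's published E-model parameters.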
