Speech over VoIP Networks: Advanced Signal Processing and System Implementation

Speech communication using the Voice over Internet Protocol (VoIP) is now commonplace. The underlying network channel may be the public switched telephone network (PSTN), a satellite channel, or a cellular wireless channel, to name a few. The packetization of speech and its transmission through packet-switched networks, however, introduce numerous impairments, such as delay, jitter, packet loss, and decoder clock offset, which degrade speech quality. We present an overview of these challenges and describe the advanced signal processing algorithms used to combat the impairments and make the perceived quality of a VoIP conversation as good as that of the existing telephone system. We also present an example of a speech coder designed for packet-switched networks and discuss the possibilities for hardware implementation.
