Wideband Speech Coding Standards and Applications

Increasing the bandwidth of sound signals from the telephone bandwidth of 200-3400 Hz to the wider bandwidth of 50-7000 Hz results in increased intelligibility and naturalness of speech and creates a feeling of transparent communication. Emerging end-to-end digital communication systems enable the use of wideband speech coding in numerous and diverse applications. In recognition of the need for high-quality wideband speech codecs, several standardization activities have been conducted recently, resulting in the selection of a new wideband speech codec, AMR-WB, at bit rates from 6.6 to 23.85 kbit/s by both 3GPP and ITU-T. The adoption of AMR-WB by the two bodies is of significant importance because for the first time the same codec will be adopted for wireless as well as wireline services. This will eliminate the need for transcoding and ease the implementation of wideband voice applications and services across a wide range of communication systems and equipment.

This document presents a summary of wideband speech coding standards for wideband telephony applications. The quality advantages and applications of wideband speech coding are first presented, and then the issue of telephony over packet networks is discussed. Several wideband speech coding standards are discussed, and special emphasis is given to the AMR-WB standard recently selected by 3GPP and ITU-T.

Introduction

Most speech coding systems in use today are based on telephone-bandwidth narrowband speech, nominally limited to about 200-3400 Hz and sampled at a rate of 8 kHz. This limitation built into the conventional telephone system dates back to the first transcontinental telephone service established between New York and San Francisco in 1915. The inherent bandwidth limitations in the conventional public switched telephone network (PSTN) impose a limit on communication quality.
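The figures quoted so far can be checked with some simple arithmetic. The following Python sketch is an illustration only, not part of any standard: it tests whether each speech band fits below the Nyquist frequency of a given sampling rate, and converts the lowest and highest AMR-WB bit rates into bits per frame. The helper names are hypothetical, and the 20 ms frame length is an assumption that the text above does not state.

```python
# Illustrative arithmetic only; function names are hypothetical helpers.

NARROWBAND_HZ = (200.0, 3400.0)   # telephone band quoted in the text
WIDEBAND_HZ = (50.0, 7000.0)      # wideband speech band quoted in the text


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2.0


def band_fits(band_hz: tuple, sample_rate_hz: float) -> bool:
    """True if the whole band lies strictly below the Nyquist frequency."""
    low, high = band_hz
    return 0.0 <= low < high < nyquist_hz(sample_rate_hz)


def bits_per_frame(bit_rate_kbps: float, frame_ms: float = 20.0) -> int:
    """Bits carried by one frame; kbit/s times ms gives bits.

    The 20 ms default frame length is an assumption, not taken from the text.
    """
    return round(bit_rate_kbps * frame_ms)


if __name__ == "__main__":
    for name, band in (("narrowband", NARROWBAND_HZ), ("wideband", WIDEBAND_HZ)):
        for rate in (8000.0, 16000.0):
            print(f"{name} at {rate:.0f} Hz sampling: fits = {band_fits(band, rate)}")
    for kbps in (6.6, 23.85):
        print(f"{kbps} kbit/s -> {bits_per_frame(kbps)} bits per assumed 20 ms frame")
```

Under these assumptions the 50-7000 Hz band does not fit below the 4 kHz Nyquist limit of 8 kHz sampling, which is why wideband coding needs a higher sampling rate.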
The increasing penetration of end-to-end digital networks, such as second- and third-generation wireless systems, ISDN, and voice over packet networks, will permit the use of a wider speech bandwidth, offering communication quality that significantly surpasses that of the PSTN and creating the sensation of face-to-face communication. Most of the energy in speech signals is present below 7 kHz, although it may extend to higher frequencies, particularly for unvoiced sounds. In wideband speech coding, the signal is sampled at 16 kHz,
