Wideband speech and audio coding

Typical parameters of wideband speech and audio signals, including their digitized versions, potential applications, and available transmission media, are described. Facts about human auditory perception that are exploited in audio coding are reviewed, along with quality measures that play an important role in coder evaluation and design. Techniques for efficient coding of wideband speech and audio signals, with an emphasis on existing standards, are discussed. The audio coding standard developed by the Moving Picture Experts Group within the International Organization for Standardization (ISO/MPEG) is covered in some detail, since it will be used in many application areas, including digital storage, transmission, and broadcasting of audio-only signals, as well as audiovisual applications such as videotelephony, videoconferencing, and TV broadcasting. Ongoing research and standardization work is outlined.
