Introduction to Digital Audio

This chapter describes the most important and prevailing technologies and algorithms of audio coding, along with the related standards activities. Its three papers on audio coding give overviews of the state of the art in speech and audio coding, its brief history, current research directions, and standards activities. A paper on the foundations of psychoacoustics covers audio perception, masking, and perceptual coding, and a paper on immersive audio systems addresses a technology with wide applications in virtual reality systems and on the Internet. New algorithms and standards for audio coding are being developed quickly to meet the growing needs of diverse applications. As the research focus moves toward rates of 2.4 kbps and below, waveform coding, even with the best CELP algorithms, will have difficulty meeting ever higher quality objectives. Consequently, interest in vocoders is resurging. Wideband audio coding, on the other hand, has been dominated by the work developed for the MPEG/Audio standards. The chapter also presents a novel desktop audio system with integrated listener-tracking capability that circumvents several of the technological limitations faced by today's digital audio workstations.
