A Novel Scheme for Low Bitrate Unified Speech and Audio Coding – MPEG RM0

Coding of speech signals at low bitrates, such as 16 kbps, has to rely on an efficient speech reproduction model to achieve reasonable speech quality. However, this approach generally fails for audio signals that do not fit the model. On the other hand, generic audio codecs, designed to handle any kind of audio signal, tend to show unsatisfactory results for speech signals, especially at low bitrates. To overcome this, ISO/MPEG initiated a process aiming to standardize a new codec with consistently high quality for speech, music and mixed content over a broad range of bitrates. After a formal listening test evaluating several proposals, MPEG selected the best-performing codec as the reference model for the standardization process. This paper describes this codec in detail and shows that the new reference model reaches the goal of consistently high quality for all signal types.
