Progress in LPC-based frequency-domain audio coding

This paper describes progress in frequency-domain audio coding schemes based on linear predictive coding (LPC). Although LPC was originally used only in time-domain speech coders, it has been applied to frequency-domain coders since the late 1980s. With progress in associated technologies, frequency-domain LPC-based audio coding has become more promising, and since 2010 it has been adopted in speech/audio coding standards such as MPEG-D Unified Speech and Audio Coding (USAC) and 3GPP Enhanced Voice Services (EVS). Three of the latest investigations into the representation of LPC envelopes in frequency-domain coders are presented: the harmonic model, frequency-resolution warping, and the powered all-pole spectral envelope. All three aim to further enhance coding efficiency.
