Scalable Audio Coding Using Trellis-Based Optimized Joint Entropy Coding and Quantization

Current scalable audio coding schemes perform considerably worse than a nonscalable coder operating at the same bitrate. This suboptimality results from the independent coding of the layers in these systems, and entropy coding is one of the contributing factors. In practical audio coding systems, including MPEG advanced audio coding (AAC), the transform-domain coefficients are quantized using an entropy-constrained quantizer. In MPEG-4 scalable AAC (S-AAC), quantization and coding are performed separately at each layer. In the case of Huffman coding, the redundancy introduced by the entropy coding at each layer is larger at lower quantization resolutions, and the redundancy of the overall coder grows with the number of layers. There is thus a tradeoff between overall redundancy and fine-grain scalability, since finer granularity means a smaller bitrate per layer and therefore more layers. In this paper, a fine-grain scalable coder for audio signals is proposed in which the entropy coding of a quantizer is made scalable through the joint design of entropy coding and quantization. By constructing a Huffman-like coding tree whose internal nodes can be mapped to reconstruction points, the tree can be pruned at any internal node to control the rate-distortion (RD) performance of the encoder in a fine-grain manner. A set of metrics and a trellis-based approach are proposed to create a coding tree that generates an appropriate path on the RD plane. The results show that the proposed method outperforms scalable audio coding based on reconstruction-error quantization as used in practical systems, e.g., S-AAC.
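The pruning idea can be illustrated with a minimal Python sketch. It is not the trellis-based design proposed in the paper: it assumes an ordinary Huffman merge order, a small Laplacian-like discrete source, squared-error distortion, and a reconstruction point at each internal node equal to the centroid of the leaves below it. All names (`Node`, `build_tree`, `prune`) are hypothetical. Pruning the tree at a maximum depth yields one (rate, distortion) pair per depth, tracing a path on the RD plane.

```python
import heapq
import itertools


class Node:
    """Tree node storing probability and first/second moments of its leaves."""

    def __init__(self, prob, m1, m2, left=None, right=None):
        self.prob, self.m1, self.m2 = prob, m1, m2
        self.left, self.right = left, right

    @property
    def point(self):
        # Assumed reconstruction point: centroid of the leaves below this node.
        return self.m1 / self.prob

    @property
    def distortion(self):
        # sum_i p_i (x_i - c)^2 = m2 - m1^2 / prob, clipped against rounding.
        return max(self.m2 - self.m1 ** 2 / self.prob, 0.0)


def build_tree(values, probs):
    """Plain Huffman merging; a stand-in for the paper's optimized tree."""
    tie = itertools.count()  # tie-breaker so the heap never compares Nodes
    heap = [(p, next(tie), Node(p, p * v, p * v * v))
            for v, p in zip(values, probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        merged = Node(a.prob + b.prob, a.m1 + b.m1, a.m2 + b.m2, a, b)
        heapq.heappush(heap, (merged.prob, next(tie), merged))
    return heap[0][2]


def prune(root, max_depth):
    """Return (average rate in bits, mean squared error) after pruning."""
    rate = dist = 0.0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node.left is None or depth == max_depth:
            # Decoding stops here and outputs node.point.
            rate += node.prob * depth
            dist += node.distortion
        else:
            stack.append((node.left, depth + 1))
            stack.append((node.right, depth + 1))
    return rate, dist


# Hypothetical source: quantizer indices -4..4 with Laplacian-like weights.
values = list(range(-4, 5))
weights = [0.5 ** abs(v) for v in values]
probs = [w / sum(weights) for w in weights]
root = build_tree(values, probs)
for d in range(9):
    r, mse = prune(root, d)
    print(f"depth {d}: rate = {r:.3f} bits, MSE = {mse:.4f}")
```

In this sketch the rate never decreases and the distortion never increases as the pruning depth grows, because splitting a node into children with their own centroids cannot raise the squared error. A plain Huffman merge groups symbols by probability rather than by proximity, so the intermediate RD points are generally poor; this is the kind of path that a jointly designed tree is meant to improve.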
