A summation algorithm for MPEG-1 coded audio signals: a first step towards audio processing in the compressed domain

In this paper, we propose a new method for summing MPEG-1 encoded audio signals in the compressed domain. The method operates directly in the subband domain, which reduces the delay and the implementation complexity introduced by the filter banks. The main problem to be solved is the bit allocation and subsequent quantization of the summed signal. The bit allocation of encoded MPEG signals is based on the signal-to-mask ratios of the individual signals. To estimate the signal-to-mask ratios of the combined signal, the algorithm needs only the information contained in the individual encoded frames. Along with a description of the proposed algorithm, we detail variations and applications of it. Finally, we evaluate the performance of the method against direct summation in the time domain.
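As an illustration of the idea only, the following minimal Python sketch adds two frames of dequantized subband samples and estimates a per-subband signal-to-mask ratio for the sum. It is not the algorithm of the paper, whose details the abstract does not give. It assumes that per-subband signal and mask levels (in dB) have already been recovered from each encoded frame, that the two signals are uncorrelated so their powers add, and that masking thresholds also add in power. The function names `sum_subband_samples` and `estimate_combined_smr` are hypothetical.

```python
import math


def db_to_power(level_db):
    """Convert a level in dB to linear power."""
    return 10.0 ** (level_db / 10.0)


def power_to_db(power):
    """Convert linear power to a level in dB."""
    return 10.0 * math.log10(power)


def sum_subband_samples(frame_a, frame_b):
    """Add two frames sample by sample.

    Each frame is a list of subbands, and each subband is a list of
    dequantized samples. Both frames must have the same shape.
    """
    return [
        [x + y for x, y in zip(subband_a, subband_b)]
        for subband_a, subband_b in zip(frame_a, frame_b)
    ]


def estimate_combined_smr(signal_db_a, mask_db_a, signal_db_b, mask_db_b):
    """Estimate the per-subband signal-to-mask ratio (dB) of the sum.

    Assumption (not taken from the paper): signal powers and mask
    powers of the two inputs both add linearly in each subband.
    """
    combined = []
    for s_a, m_a, s_b, m_b in zip(signal_db_a, mask_db_a, signal_db_b, mask_db_b):
        signal_power = db_to_power(s_a) + db_to_power(s_b)
        mask_power = db_to_power(m_a) + db_to_power(m_b)
        combined.append(power_to_db(signal_power / mask_power))
    return combined
```

Under these assumptions, summing two inputs with identical signal and mask levels in a subband leaves the estimated signal-to-mask ratio of that subband unchanged, because signal power and mask power both double. A real bit allocator would then map the estimated ratios to quantizer resolutions within the available bit budget.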
