Abstract: A perceptual audio coder, in which each audio segment is
adaptively analyzed using either a sinusoidal or an optimum wavelet basis
according to the time-varying characteristics of the audio signals, has been
constructed. The basis optimization is achieved by a novel switched filter
bank scheme, which switches between a uniform filter bank structure
(discrete cosine transform) and a non-uniform filter bank structure
(discrete wavelet transform). Pre-echo distortion, a major artifact of the
ISO/Moving Picture Experts Group (MPEG) audio coding standard (MPEG-I
layers 1 and 2), which uses a uniform filter bank structure for audio
signal analysis, is almost eliminated in the proposed coder. A
perceptual masking model implemented using a high-resolution wavelet packet
filter bank with 27 subbands, closely mimicking the critical bands
of the human auditory system, is employed in this audio coder. The resulting
scheme is a variable bit-rate audio coder, which provides compression ratios
comparable to MPEG-I layers 1 and 2 with almost transparent quality.
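
The per-segment switching idea can be illustrated with a minimal sketch. It is not the coder's actual method: the abstract does not state the switching criterion or the optimized wavelet, so the Haar wavelet, the L1-norm cost, and the names `dct_ortho`, `haar_dwt`, `l1_cost`, and `choose_basis` are all assumptions made for illustration. Both transforms are orthonormal, so a smaller L1 norm indicates that the segment's energy is packed into fewer coefficients.

```python
import math

def dct_ortho(x):
    """Orthonormal DCT-II (stand-in for the uniform, sinusoidal basis)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * acc)
    return out

def haar_dwt(x):
    """Full-depth orthonormal Haar DWT (stand-in for the non-uniform basis).

    Assumption: the Haar wavelet replaces the paper's optimized wavelet.
    The input length must be a power of two.
    """
    approx = list(x)
    details = []
    while len(approx) > 1:
        half = len(approx) // 2
        s = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        d = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = d + details  # coarser-level details go in front
        approx = s
    return approx + details

def l1_cost(coeffs):
    """Hypothetical compaction measure: sum of absolute coefficient values."""
    return sum(abs(c) for c in coeffs)

def choose_basis(segment):
    """Return 'dct' or 'dwt', whichever represents the segment more compactly."""
    costs = {"dct": l1_cost(dct_ortho(segment)), "dwt": l1_cost(haar_dwt(segment))}
    return min(costs, key=costs.get)

n = 64
# Stationary segment: exactly one DCT basis function (index 8).
tone = [math.cos(math.pi * (i + 0.5) * 8 / n) for i in range(n)]
# Transient segment: a single click, the kind of onset that causes pre-echo.
click = [1.0 if i == 40 else 0.0 for i in range(n)]
print(choose_basis(tone), choose_basis(click))
```

Under this cost, the stationary tone is assigned to the DCT and the click to the wavelet basis, which mirrors the motivation for switching: a transient spreads across all coefficients of a uniform filter bank but stays localized in a wavelet decomposition.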