Watermarking is a technique that consists in hiding (embedding) binary information within a signal in an imperceptible way; in the present context of audio signals, this means that the mark is inaudible. Watermarking was first used for the protection of digital content as part of Digital Rights Management (DRM). In this context of secure applications, significant effort was devoted to ensuring the robustness of watermarks against pirate attacks aiming at neutralizing them, rather than to increasing the quantity of embedded information; the bitrate was usually in the range of tens of bits per second (bps) for audio signals. Nowadays, audio watermarking can be used for other kinds of applications, in particular metadata transmission. Bitrates usually remain quite low, however, although such applications call for higher bitrates traded against lower robustness. In this study we propose a high-capacity watermarking technique for audio signals. This technique is suitable for many uncompressed audio signals, and more particularly for 16-bit Pulse Code Modulation (PCM) signals, as widely used in the audio-CD and wav formats. The proposed technique is based on applying Quantization Index Modulation (QIM) to the Modified Discrete Cosine Transform (MDCT) coefficients of the signal. The underlying principle is that if those coefficients can be significantly modified by quantization in audio compression schemes such as MPEG MP3/AAC without quality impairment, they can also be modified to embed watermark codes. Following audio compression principles, a psychoacoustic model (PAM) is used at the watermark embedder to take into account the behavior of the human auditory system and meet the inaudibility constraint. The PAM is used to estimate an optimal watermarking capacity for each sub-band of each MDCT frame.
The resulting capacity values are transmitted as (watermarked) side-information to the decoder, so that the decoder can retrieve the useful watermarked information in the corresponding sub-band. To this end, specific fixed capacities are allotted in the higher sub-band of the spectrum. With this technique, maximal bitrates of about 250 kbps per audio channel can be reached (depending on the audio content), at the expense of robustness: the system is intended for “non-secure” applications in which the signal undergoes no attack other than quantization for uncompressed format conversion. For instance, we use this technique in a watermark-informed source separation system presented at the same congress.
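
As an illustration of the QIM principle mentioned above, the following is a minimal sketch of scalar QIM applied to a single coefficient, not the exact embedder used in this study. The function names, the `delta` parameter (the resolution of the fine grid that must survive requantization) and the integer `capacity_bits` (the capacity that a PAM would assign to the coefficient's sub-band) are assumptions introduced for the example.

```python
def qim_embed(coeff, message, capacity_bits, delta=1.0):
    """Embed an integer message into one coefficient by scalar QIM.

    Hypothetical helper: `delta` is the fine-grid step and `capacity_bits`
    the number of bits carried by this coefficient (both assumed here).
    """
    levels = 1 << capacity_bits          # number of distinct messages
    if not 0 <= message < levels:
        raise ValueError("message does not fit in capacity_bits")
    coarse = levels * delta              # step of each message-specific quantizer
    offset = message * delta             # shift that selects the quantizer
    # Quantize onto the coarse grid shifted by the message offset.
    return round((coeff - offset) / coarse) * coarse + offset


def qim_decode(coeff, capacity_bits, delta=1.0):
    """Recover the message from a (possibly slightly perturbed) coefficient."""
    levels = 1 << capacity_bits
    # Snap to the fine grid, then read the index modulo the number of messages.
    return round(coeff / delta) % levels


# Example: 3 bits embedded in a coefficient of value 37.3.
marked = qim_embed(37.3, message=5, capacity_bits=3)   # 37.0
recovered = qim_decode(marked + 0.4, capacity_bits=3)  # 5, despite the perturbation
```

In this sketch the embedding error is at most half the coarse step, `2**capacity_bits * delta / 2`, which is the quantity a psychoacoustic model would have to keep below the masking threshold, and decoding tolerates perturbations smaller than `delta / 2`. As an order of magnitude, assuming CD-quality audio sampled at 44.1 kHz and a critically sampled MDCT (44,100 coefficients per second per channel), a bitrate of 250 kbps corresponds to an average of roughly 5.7 bits per coefficient.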