Data embedding in speech signals using perceptual masking

This paper presents a data embedding technique for speech signals that exploits the masking property of the human auditory system. The signal is partitioned into subbands in the frequency domain, and the embedding parameters of each subband are computed from the auditory masking threshold function and a channel noise estimate. Data are embedded by modifying the Discrete Hartley Transform (DHT) coefficients according to the principles of the Scalar Costa Scheme (SCS). At the decoder, a maximum likelihood detector detects the presence of embedded data and estimates the embedding quantization step. We demonstrate the technique by simulating data embedding in a speech signal transmitted over a telephone line. The demonstrated system achieves transparent data embedding at a rate of 300 information bits/second with a bit error rate of approximately 10^-4. The proposed technique outperforms spread spectrum (SS) based data embedding techniques for speech signals.
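The core embedding step, SCS applied to DHT coefficients, can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, the subband indices, the quantization step, the scale factor alpha, and the dither key are all hypothetical values chosen for illustration. In the paper the per-subband parameters come from the masking threshold and a channel noise estimate, and neither that computation nor the maximum likelihood detector is modeled here. The decoder below is assumed to know the step and the dither, and the channel is noiseless.

```python
import numpy as np

def dht(x):
    """Discrete Hartley Transform via the FFT: H[k] = Re(X[k]) - Im(X[k])."""
    X = np.fft.fft(x)
    return X.real - X.imag

def idht(h):
    """Inverse DHT; the DHT is its own inverse up to a factor 1/N."""
    return dht(h) / len(h)

def scs_embed(coeffs, bits, step, alpha, dither):
    """Binary SCS: move each coefficient a fraction alpha of the way
    toward the dithered lattice selected by its message bit."""
    offset = step * (bits / 2.0 + dither)
    shifted = coeffs - offset
    quant_error = step * np.round(shifted / step) - shifted
    return coeffs + alpha * quant_error

def scs_decode(received, step, dither):
    """Hard-decision SCS decoding: distance to the bit-0 lattice
    below step/4 means bit 0, otherwise bit 1."""
    shifted = received - step * dither
    y = step * np.round(shifted / step) - shifted
    return (np.abs(y) > step / 4.0).astype(int)

# Illustrative run (all values below are assumptions, not from the paper).
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)           # stand-in for one speech frame
band = np.arange(4, 20)                   # hypothetical subband of DHT bins
bits = rng.integers(0, 2, size=band.size)
dither = rng.uniform(-0.5, 0.5, size=band.size)  # shared secret key
step, alpha = 0.5, 0.7                    # fixed here; adaptive in the paper

h = dht(frame)
h[band] = scs_embed(h[band], bits, step, alpha, dither)
marked = idht(h)                          # real-valued watermarked frame

recovered = scs_decode(dht(marked)[band], step, dither)
print(np.array_equal(recovered, bits))
```

Because the DHT maps real signals to real coefficients, the modified frame stays real without any symmetry constraint on the modified bins. With alpha below 1 the embedded coefficients do not sit exactly on the lattice; the remaining offset is the SCS trade-off between embedding distortion and robustness to channel noise.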
