Fragile speech watermarking scheme with recovering speech contents

The wide use of digital speech recorders becomes a serious matter when their recordings are used as evidence in court rulings. Determining whether recorded content is authentic can become a life-or-death question. To address this concern, the least significant bits (LSBs) of the excitation signals in a hybrid speech vocoder are used as fragile watermarks. In addition, a location-variable, content-dependent watermark generation mechanism is proposed. Such a watermark allows users to detect where in a recording content has been replaced, inserted, or deleted. Lastly, an attempt is made to store partial reconstruction data in the LSBs of the excitation signals of the G.723.1 speech codec, so that the original speech content may be reconstructed after tampering. Test results show that the proposed system is reliable: watermarking lowers the perceptual evaluation of speech quality (PESQ) score by 0.2, while the accuracy in detecting tampered regions reaches up to 97.45%.
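
The idea of a location-variable, content-dependent fragile watermark carried in LSBs can be illustrated with a minimal sketch. This is not the scheme of the paper: it assumes each frame is a plain list of 16-bit integer excitation samples, it swaps in a keyed SHA-256 hash as the watermark generator, and the names `watermark_bits`, `embed`, `tampered_frames`, and the key `demo-key` are all hypothetical. It does not model the G.723.1 bitstream or the storage of reconstruction data.

```python
import hashlib


def watermark_bits(frame_index, samples, key=b"demo-key"):
    """Derive one watermark bit per sample from frame position and content.

    Assumption: samples are signed 16-bit integers and a frame holds at most
    256 samples (the digest provides 256 bits).
    """
    # Hash only the upper bits (LSB dropped), so embedding the watermark
    # in the LSBs does not change the watermark itself.
    content = b"".join((s >> 1).to_bytes(2, "big", signed=True) for s in samples)
    digest = hashlib.sha256(key + frame_index.to_bytes(4, "big") + content).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(len(samples))]


def embed(frames, key=b"demo-key"):
    """Return a copy of the frames with each sample's LSB set to its watermark bit."""
    marked = []
    for idx, frame in enumerate(frames):
        bits = watermark_bits(idx, frame, key)
        marked.append([(s & ~1) | b for s, b in zip(frame, bits)])
    return marked


def tampered_frames(frames, key=b"demo-key"):
    """List the indices of frames whose LSBs do not match the expected watermark."""
    bad = []
    for idx, frame in enumerate(frames):
        if [s & 1 for s in frame] != watermark_bits(idx, frame, key):
            bad.append(idx)
    return bad


# Toy "excitation" data: 6 frames of 32 samples each (hypothetical values).
frames = [[(i * 37 + j * 101) % 2001 - 1000 for j in range(32)] for i in range(6)]
marked = embed(frames)

replaced = [list(f) for f in marked]
replaced[2][5] += 64           # alter the content of frame 2

deleted = [list(f) for f in marked]
del deleted[1]                 # remove frame 1; later frames shift position
```

Because the frame index enters the hash, a frame that is valid at one position fails verification at another, so `tampered_frames(deleted)` flags the frames after the deletion point, while `tampered_frames(replaced)` flags only the altered frame. Embedding changes each sample by at most one quantization step, which is the reason LSBs are a natural carrier for a fragile mark.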
