Compressed Domain Perceptual Hashing for MELP Coded Speech

This paper proposes a compressed-domain perceptual hash algorithm for speech. Speech coded at very low bitrates requires a hash algorithm that is both compact and robust. Line spectral frequencies (LSFs) model the changing shape of the speaker's vocal tract and are available as an intermediate result of partially decoding the coded bitstream, so they are used to generate the hash value. The proposed method satisfies the robustness and discrimination requirements of perceptual hashing at a very low hash bitrate. It is also computationally efficient, which makes it applicable to scenarios with power restrictions or real-time communication requirements.
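
The abstract does not specify how the LSFs are mapped to hash bits. As an illustration only, the following minimal sketch assumes a sign-of-difference binarization of the LSF trajectory (a common choice in audio fingerprinting, not necessarily the scheme used in this paper) and compares hashes by bit error rate. The function names `lsf_hash` and `bit_error_rate` and the three-coefficient toy frames are hypothetical; the LSF vectors are assumed to have already been obtained by partial decoding.

```python
from typing import List, Sequence


def lsf_hash(frames: Sequence[Sequence[float]]) -> List[int]:
    """Assumed binarization: one bit per LSF coefficient per frame transition.

    A bit is 1 if the coefficient increased from the previous frame, else 0.
    """
    bits: List[int] = []
    for prev, cur in zip(frames, frames[1:]):
        for a, b in zip(prev, cur):
            bits.append(1 if b > a else 0)
    return bits


def bit_error_rate(h1: Sequence[int], h2: Sequence[int]) -> float:
    """Fraction of differing bits between two equal-length hashes."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    if not h1:
        return 0.0
    return sum(x != y for x, y in zip(h1, h2)) / len(h1)


# Toy LSF trajectories (3 coefficients per frame, purely illustrative values).
original = [[0.30, 0.80, 1.50], [0.35, 0.75, 1.60], [0.32, 0.90, 1.55]]
# Same content with small perturbations, e.g. after a content-preserving operation.
perturbed = [[0.31, 0.79, 1.51], [0.34, 0.76, 1.59], [0.33, 0.89, 1.56]]
# Perceptually different content.
different = [[0.40, 0.70, 1.40], [0.30, 0.95, 1.30], [0.45, 0.85, 1.45]]

robust_ber = bit_error_rate(lsf_hash(original), lsf_hash(perturbed))
discriminative_ber = bit_error_rate(lsf_hash(original), lsf_hash(different))
```

In this toy setting a low bit error rate between `original` and `perturbed` corresponds to robustness, and a high one between `original` and `different` corresponds to discrimination. An authentication decision would compare the bit error rate against a threshold, which would have to be chosen empirically.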
