Paper information - Frequency-Temporal Filtering for a Robust Audio Fingerprinting Scheme in Real-Noise Environments

Frequency-Temporal Filtering for a Robust Audio Fingerprinting Scheme in Real-Noise Environments

In real environments, sound recordings are commonly distorted by channel and background noise, which are the main causes of degraded audio identification performance. Recently, Philips introduced a robust and efficient audio fingerprinting scheme that applies a differential (high-pass) filter to the frequency-time sequence of the perceptual filter-bank energies. In practice, however, the robustness of this scheme remains an important issue in a real environment. In this letter, we introduce alternative frequency-temporal filtering combinations as an extension of Philips' audio fingerprinting scheme, with the aim of achieving robustness to channel and background noise under real conditions. Our experimental results show that the proposed filtering combination improves noise robustness in audio identification.

Keywords: Music information retrieval, audio fingerprint, frequency filtering, temporal filtering.
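As an illustration of the differential filtering mentioned in the abstract, the following minimal sketch derives fingerprint bits from a frequency-time grid of band energies by taking a first difference along frequency and then along time, and keeping only the sign. This reflects the baseline Philips-style bit derivation as commonly described, not the alternative filter combination proposed in this letter. The function names (`fingerprint_bits`, `hamming_distance`) and the toy energy values are assumptions made for the example.

```python
def fingerprint_bits(energies):
    """Derive sign bits from band energies (hypothetical helper).

    energies: list of frames, each a list of filter-bank band energies.
    Returns one list of 0/1 bits per frame, starting from the second frame.
    """
    bits = []
    for n in range(1, len(energies)):
        cur, prev = energies[n], energies[n - 1]
        frame_bits = []
        for m in range(len(cur) - 1):
            # Difference across adjacent bands (frequency axis), then
            # difference of that value across adjacent frames (time axis).
            d = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            frame_bits.append(1 if d > 0 else 0)
        bits.append(frame_bits)
    return bits


def hamming_distance(a, b):
    """Count differing bits between two equally shaped fingerprints."""
    return sum(x != y for fa, fb in zip(a, b) for x, y in zip(fa, fb))


# Toy energies: 3 frames x 3 bands (made-up values for illustration).
energies = [[1.0, 2.0, 4.0], [3.0, 1.0, 5.0], [2.0, 2.0, 2.0]]
reference = fingerprint_bits(energies)

# A uniform gain change scales every difference by the same positive
# factor, so the sign bits are unchanged.
louder = [[2.0 * e for e in frame] for frame in energies]
print(reference, hamming_distance(reference, fingerprint_bits(louder)))
```

Because only the sign of the doubly differenced energies is kept, a constant gain leaves the bits unchanged in this toy case. Matching a query against a database would then typically compare fingerprints by Hamming distance.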

Hoirin Kim | Mansoo Park | Seung Hyun Yang
