Frequency-Temporal Filtering for a Robust Audio Fingerprinting Scheme in Real-Noise Environments

In a real environment, sound recordings are commonly distorted by channel and background noise, and the performance of audio identification is mainly degraded by them. Recently, Philips introduced a robust and efficient audio fingerprinting scheme applying a differential (high-pass filtering) to the frequency-time sequence of the perceptual filter-bank energies. In practice, however, the robustness of the audio fingerprinting scheme is still important in a real environment. In this letter, we introduce alternatives to the frequency-temporal filtering combination for an extension method of Philips' audio fingerprinting scheme to achieve robustness to channel and background noise under the conditions of a real situation. Our experimental results show that the proposed filtering combination improves noise robustness in audio identification. Keywords ⎯ Music information retrieval, audio fingerprint, frequency filtering, temporal filtering.