A fast search algorithm for background music signals based on the search for numerous small signal components

This paper proposes a method for detecting and locating a known music signal in a long audio stream. Unlike existing methods, it assumes that the music is used as background music (BGM), that it is overlapped by another sound such as speech, and that the interfering sound is typically louder than the target music. The proposed method builds on time-series active search, a quick signal search method reported earlier. To realize BGM search, a novel extension is introduced: the music signal is first decomposed into a number of small time-frequency regions, and the search is carried out for each of those components. The per-component search results are then integrated by a voting scheme to find the locations of the target music. Experiments show that accurate search is possible at a signal-to-noise ratio (SNR) of -5 dB and that the search completes in about 8 s for a 30-min stored signal.
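
The decompose, search, and vote idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes spectrograms stored as NumPy arrays of shape (frames, bands), uses a brute-force sliding normalized correlation in place of time-series active search, and the function name `vote_for_offsets`, the patch sizes, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def vote_for_offsets(query, stored, patch_frames=8, patch_bands=4, threshold=0.8):
    """Count, for each candidate start frame, how many query patches match there.

    Assumption: `query` and `stored` are (frames, bands) spectrogram-like arrays.
    Brute-force matching stands in for the faster search used in the paper.
    """
    q_frames, n_bands = query.shape
    n_offsets = stored.shape[0] - q_frames + 1
    votes = np.zeros(n_offsets, dtype=int)

    # Decompose the query into small, non-overlapping time-frequency patches.
    for t0 in range(0, q_frames - patch_frames + 1, patch_frames):
        for b0 in range(0, n_bands - patch_bands + 1, patch_bands):
            patch = query[t0:t0 + patch_frames, b0:b0 + patch_bands]
            p = patch - patch.mean()
            p_norm = np.linalg.norm(p)
            if p_norm == 0:
                continue  # a flat patch carries no usable information

            # Search for this patch at the same band position in the stored signal.
            for offset in range(n_offsets):
                start = offset + t0
                window = stored[start:start + patch_frames, b0:b0 + patch_bands]
                w = window - window.mean()
                w_norm = np.linalg.norm(w)
                if w_norm == 0:
                    continue
                similarity = np.dot(p.ravel(), w.ravel()) / (p_norm * w_norm)
                # A matching patch votes for the query start frame it implies.
                if similarity >= threshold:
                    votes[offset] += 1
    return votes


# Synthetic demo: embed the query at frame 50 and bury half of its bands
# under a much louder interfering signal.
rng = np.random.default_rng(0)
query = rng.random((32, 16))
stored = rng.random((200, 16))
stored[50:82] = query
stored[50:82, :8] += 5.0 * rng.random((32, 8))

votes = vote_for_offsets(query, stored)
best_offset = int(votes.argmax())
```

In this synthetic setup the patches in the undisturbed bands still match, so the vote peak falls at the embedded position even though a whole-signal comparison would be dominated by the interference. Real audio would call for a proper spectral front end and a tuned threshold.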
