A fast search algorithm for background music signals based on the search for numerous small signal components

This paper proposes a method for detecting and locating a known music signal in a long audio stream. Unlike existing methods, it assumes that the music is used as background music (BGM), that it is overlapped by another sound such as speech, and that the interfering sound is typically louder than the target music. The proposed method is based on time-series active search, a quick signal search method reported earlier (Kashino, K. et al., Proc. ICASSP-99, vol. VI, 1999). To realize the BGM search, a novel extension is introduced: the music signal is first decomposed into a number of small time-frequency regions, and the search is carried out for each of those components. The results of these searches are then integrated through a voting scheme to find the locations of the target music. Experiments show that an accurate search is possible at an SNR of -5 dB and that the search completes in about 8 s for a 30 min stored signal.
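The decompose, search, and vote idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the query and the stored signal are already available as magnitude spectrograms (frames by frequency bands), and it replaces time-series active search with a plain exhaustive sliding comparison using squared distance. The function name `bgm_vote_search` and the parameters `block_t` and `block_f` are hypothetical and chosen only for illustration.

```python
import numpy as np


def bgm_vote_search(query, stored, block_t=10, block_f=4):
    """Locate `query` in `stored` by voting over small time-frequency regions.

    query:  array of shape (T, F), spectrogram of the known music (assumed input)
    stored: array of shape (N, F), spectrogram of the long stream, with N >= T
    Returns (best_start_frame, votes).
    NOTE: exhaustive search with squared distance is an illustrative stand-in,
    not the time-series active search used in the paper.
    """
    T, F = query.shape
    N = stored.shape[0]
    # One vote bin per candidate start frame of the whole query.
    votes = np.zeros(N - T + 1, dtype=int)

    for t0 in range(0, T - block_t + 1, block_t):
        for f0 in range(0, F - block_f + 1, block_f):
            # One small time-frequency region of the query.
            region = query[t0:t0 + block_t, f0:f0 + block_f]
            band = stored[:, f0:f0 + block_f]
            # All length-block_t windows of the same band in the stored signal;
            # resulting shape is (N - block_t + 1, block_f, block_t).
            windows = np.lib.stride_tricks.sliding_window_view(
                band, block_t, axis=0
            )
            dist = ((windows - region.T) ** 2).sum(axis=(1, 2))
            # The best match of this region implies a start frame for the query.
            start = int(np.argmin(dist)) - t0
            if 0 <= start < votes.size:
                votes[start] += 1

    return int(np.argmax(votes)), votes
```

Because each region votes independently, regions masked by a loud interfering sound can vote for wrong positions without outweighing the regions that remain clean, which is the intuition behind searching for many small components rather than the whole signal at once.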

[1] Kunio Kashino, et al. Fast music retrieval using polyphonic binary feature vectors, 2002, Proceedings. IEEE International Conference on Multimedia and Expo.

[2] Brian Christopher Smith, et al. Query by humming: musical information retrieval in an audio database, 1995, MULTIMEDIA '95.

[3] Kunio Kashino, et al. Feature fluctuation absorption for a quick audio retrieval from long recordings, 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[4] Kunio Kashino, et al. A method for robust and quick video searching using probabilistic dither-voting, 2001, Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205).

[5] Kunio Kashino, et al. Time-series active search for quick retrieval of audio and video, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[6] Naoko Kosugi, et al. Music retrieval by humming-using similarity retrieval over high dimensional feature vector space, 1999, 1999 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM 1999). Conference Proceedings (Cat. No.99CH36368).

[7] Ian H. Witten, et al. Sequence-based melodic comparison: a dynamic programming approach, 1998.

[8] Mototsugu Abe, et al. Self-optimized spectral correlation method for background music identification, 2002, Proceedings. IEEE International Conference on Multimedia and Expo.