Locating Segments with Drums in Music Signals

A system is described which segments musical signals according to the presence or absence of drum instruments. Two different yet approximately equally accurate approaches were taken to solve the problem. The first is based on periodicity detection in the amplitude envelopes of the signal at subbands. The band-wise periodicity estimates are aggregated into a summary autocorrelation function, the characteristics of which reveal the drums. The other mechanism applies straightforward acoustic pattern recognition approach with mel-frequency cepstrum coefficients as features and a Gaussian mixture model classifier. The integrated system achieves 88 % correct segmentation over a database of 28 hours of music from different musical genres. For the both methods, errors occur for borderline cases with soft percussive-like drum accompaniment, or transient-like instrumentation without drums.

[1]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[2]  S. Schwerman,et al.  The Physics of Musical Instruments , 1991 .

[3]  Hagen Soltau,et al.  Recognition of music types , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[4]  Ma Conway,et al.  Handbook of perception and cognition , 1996 .

[5]  Matti Karjalainen,et al.  A computationally efficient multipitch analysis model , 2000, IEEE Trans. Speech Audio Process..

[6]  Jonathan Foote,et al.  Content-based retrieval of music and audio , 1997, Other Conferences.

[7]  SIGNAL MODELING WITH SINUSOIDS PLUS NOISE , 2001 .

[8]  Ray Meddis,et al.  Virtual pitch and phase sensitivity of a computer model of the auditory periphery , 1991 .

[9]  Stephen Travis Pope,et al.  Musical Signal Processing , 1997 .

[10]  George Tzanetakis,et al.  Automatic Musical Genre Classification of Audio Signals , 2001, ISMIR.

[11]  Eric D. Scheirer,et al.  Tempo and beat analysis of acoustic musical signals. , 1998, The Journal of the Acoustical Society of America.

[12]  Xavier Serra,et al.  Musical Sound Modeling with Sinusoids plus Noise , 1997 .

[13]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[14]  Ishwar K. Sethi,et al.  Classification of general audio data for content-based retrieval , 2001, Pattern Recognit. Lett..

[15]  Ahmed H. Tewfik,et al.  Rhythm and periodicity detection in polyphonic music , 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451).

[16]  Vesa T. Peltonen,et al.  Computational auditory scene recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.