Musical Signal Type Discrimination based on Large Open Feature Sets

Automatic discrimination of musical signal types such as speech, singing, music, genres, or drumbeats within audio streams is of great importance, e.g. for the segmentation of radio broadcast streams. Yet the choice of feature sets remains widely debated. We therefore suggest a large open feature set approach that starts with the systematic generation of 7k high-level features based on MPEG-7 low-level descriptors and further feature contours. A subsequent fast gain-ratio reduction followed by a wrapper-based floating search yields a strong basis of relevant features. Next, further features are added by alteration and combination within a genetic search. For classification we use support vector machines, which have proven reliable for this task. Test runs are carried out on two task-specific databases and the public Columbia SMD database and show significant improvements for each step of the suggested novel concept.
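The fast gain-ratio reduction step can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width binning, the number of bins, and the function names `entropy`, `gain_ratio`, and `rank_features` are assumptions made for illustration only, since the abstract does not state how continuous features are discretized or how many features are retained.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, bins=4):
    """Gain ratio of one continuous feature.

    Assumption: the feature is discretized by equal-width binning;
    the source does not specify a discretization scheme.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    binned = [min(int((v - lo) / width), bins - 1) for v in values]

    # Conditional entropy of the class labels given the binned feature.
    n = len(labels)
    cond = 0.0
    for b in set(binned):
        subset = [y for x, y in zip(binned, labels) if x == b]
        cond += len(subset) / n * entropy(subset)

    info_gain = entropy(labels) - cond
    split_info = entropy(binned)  # penalizes features with many sparse bins
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(rows, labels, keep):
    """Return the indices of the `keep` features with the highest gain ratio."""
    n_feat = len(rows[0])
    scores = [gain_ratio([r[j] for r in rows], labels) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)[:keep]


# Toy example: feature 0 separates the classes, feature 1 does not,
# and feature 2 is constant.
rows = [[0.1, 5.0, 1.0], [0.2, 1.0, 1.0], [0.9, 5.0, 1.0], [1.0, 1.0, 1.0]]
labels = ["speech", "speech", "music", "music"]
best = rank_features(rows, labels, keep=1)
```

Because such a filter scores each feature independently of the classifier, it is cheap enough to prune a very large candidate set before a costlier wrapper-based search evaluates feature subsets with the classifier itself.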
