Unsupervised speech/music classification using one-class support vector machines

Audio classification is an important topic in current audio processing and content analysis research, and speech/music classification is one of its most interesting branches. In this paper we present an unsupervised clustering method, based on one-class support vector machines (OCSVM) and inspired by the classical K-means algorithm, that effectively classifies speech/music signals. First, relevant features are extracted from the audio files. Then, in an iterative K-means-like algorithm, the cluster centers are initialized and each cluster is refined using a one-class support vector machine. Experimental results show that the clustering method, which is easy to implement, performs better than other methods evaluated on the same database.
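The iterative procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the initialization, the kernel, or the reassignment rule, so everything below is an assumption. The function name `ocsvm_kmeans`, the farthest-point initialization, the RBF kernel, the values of `nu` and `gamma`, and the size-normalized scoring are all hypothetical choices, and the input `X` stands in for whatever feature vectors were extracted from the audio. The sketch uses scikit-learn's `OneClassSVM`.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def ocsvm_kmeans(X, n_clusters=2, n_iter=20, nu=0.1, gamma=0.1, seed=0):
    """K-means-like clustering where each cluster is modeled by a one-class SVM.

    Assumed scheme (not taken from the paper): farthest-point initialization,
    then alternate between fitting one OCSVM per cluster and reassigning every
    sample to the cluster whose OCSVM scores it highest.
    """
    rng = np.random.default_rng(seed)

    # Initialization (assumption): first center is random, each further
    # center is the sample farthest from all centers chosen so far.
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)

    # Initial labels: nearest center in Euclidean distance, as in K-means.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    for _ in range(n_iter):
        scores = np.full((len(X), n_clusters), -np.inf)
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) < 2:
                continue  # too few samples to fit a model; cluster stays empty
            model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(members)
            # Heuristic: divide by cluster size so that scores from clusters
            # of different sizes are roughly comparable.
            scores[:, k] = model.decision_function(X) / len(members)

        new_labels = scores.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stable: converged
        labels = new_labels

    return labels
```

Calling `ocsvm_kmeans(features, n_clusters=2)` on an `(n_samples, n_features)` array returns one cluster index per sample. Which index corresponds to speech and which to music is not determined by the clustering itself and would have to be decided separately.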