Paper information - Unsupervised speech/music classification using one-class support vector machines


Audio classification is an important problem in current audio processing and content analysis research, and speech/music classification is one of the most interesting branches of audio signal classification. In this paper we present an unsupervised clustering method, based on one-class support vector machines (OCSVM) and inspired by the classical K-means algorithm, that effectively classifies speech/music signals. First, relevant features are extracted from the audio files. Then, in an iterative K-means-like algorithm, the cluster centers are initialized and each cluster is refined using a one-class support vector machine. The experimental results show that the clustering method, which is easy to implement, performs better than other methods evaluated on the same database.
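The iterative procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, the kernel, the reassignment rule, or the stopping criterion, so everything below is an assumption. The function name `ocsvm_kmeans`, its parameters (`nu`, `gamma`, `init_idx`), the use of scikit-learn's `OneClassSVM` with an RBF kernel, and the rule "assign each sample to the cluster whose model scores it highest" are all hypothetical choices, and the two synthetic Gaussian blobs merely stand in for speech and music feature vectors.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def ocsvm_kmeans(X, k=2, nu=0.1, gamma=0.5, max_iter=20, init_idx=None, seed=0):
    """K-means-like clustering where each cluster is modelled by a one-class SVM.

    Hypothetical sketch: the reassignment rule and all defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    # K-means-style initialisation: choose k samples as centres
    # (random unless init_idx is given), then assign by nearest centre.
    if init_idx is None:
        init_idx = rng.choice(len(X), size=k, replace=False)
    centers = X[np.asarray(init_idx)]
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    for _ in range(max_iter):
        scores = np.full((len(X), k), -np.inf)
        for j in range(k):
            members = X[labels == j]
            if len(members) < 2:
                continue  # degenerate cluster: its scores stay at -inf
            model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(members)
            # score_samples is the kernel expansion without the per-model
            # offset, so values from different models share a common baseline.
            scores[:, j] = model.score_samples(X)
        new_labels = scores.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable: stop iterating
        labels = new_labels
    return labels


# Synthetic stand-ins for two classes of feature vectors (not real audio data).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),         # first blob
    rng.normal(0.0, 1.0, size=(100, 2)) + 20.0,  # second blob, far away
])
# One initial centre is taken from each blob to keep the demo deterministic.
labels = ocsvm_kmeans(X, k=2, init_idx=[0, 100])
```

Comparing `score_samples` rather than `decision_function` is a design choice of this sketch: each fitted model has its own offset, so thresholded decision values from different models are not directly comparable. With real audio features, the result would also depend heavily on the initialisation and on the choice of `nu` and `gamma`.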
