Broadcast news audio classification using SVM binary trees

Audio classification is one of the most important tasks in content-based analysis and is used in many audio applications, such as indexing and retrieval. This paper addresses the problem of broadcast news audio classification using a support vector machine binary tree (SVM-BT) architecture that separates audio into five classes: pure speech, speech with music, speech with environment sound, pure music, and environment sound. One of the most substantial steps in creating such a classification architecture is the selection of an optimal feature set for each binary SVM classifier. We therefore implement the F-score feature selection algorithm as an effective search algorithm within a space of characteristic features that are mostly used for speech/non-speech discrimination.
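As an illustration of the feature selection step, the following is a minimal sketch of F-score ranking for one binary node of the tree. It assumes the commonly used F-score definition (squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances). The function names `f_scores` and `select_features`, the 0/1 label encoding, and the top-k cutoff are illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, variance


def f_scores(X, y):
    """Return one F-score per feature for a two-class problem.

    X is a list of feature vectors (lists of floats).
    y is a list of labels, assumed here to be 1 (positive) or 0 (negative).
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    if len(pos) < 2 or len(neg) < 2:
        raise ValueError("each class needs at least two samples")

    scores = []
    for i in range(len(X[0])):
        col_all = [row[i] for row in X]
        col_pos = [row[i] for row in pos]
        col_neg = [row[i] for row in neg]
        overall = mean(col_all)
        # Between-class separation of feature i.
        numerator = (mean(col_pos) - overall) ** 2 + (mean(col_neg) - overall) ** 2
        # Within-class spread (sample variances, n - 1 in the denominator).
        denominator = variance(col_pos) + variance(col_neg)
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


def select_features(X, y, k):
    """Return the indices of the k features with the highest F-score.

    The top-k cutoff is an assumption made for this sketch.
    """
    scores = f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    X = [[0.0, 5.0], [1.0, 4.0], [10.0, 5.0], [11.0, 4.0]]
    y = [0, 0, 1, 1]
    print(f_scores(X, y))
    print(select_features(X, y, 1))
```

In an SVM-BT setting, a ranking of this kind could be computed separately for each binary classifier, so that every node of the tree is trained on its own feature subset.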
