Classification of Broadcast News Audio Data Employing Binary Decision Architecture

This paper presents a novel binary decision architecture (BDA) for the broadcast news audio classification task. The architecture is motivated by the observation that an appropriate combination of multiple binary classifiers, each solving a two-class discrimination problem, can reduce the misclassification error without a rapid increase in computational complexity. The core element of the architecture is a binary decision (BD) algorithm that discriminates between each pair of acoustic classes using one of two types of decision function. The first is a simple rule-based approach in which the final decision is made according to the value of a selected discrimination parameter. Its main advantage is the relatively low processing time needed to classify all acoustic classes, at the cost of lower classification accuracy. The second employs a support vector machine (SVM) classifier. In this case, the overall classification accuracy depends on finding the optimal parameters of the decision function, which yields better classification performance at the price of higher computational complexity. The final form of the proposed BDA combines four BD discriminators supplemented by a decision table. The effectiveness of the proposed BDA, with both the rule-based approach and the SVM classifier, is compared with two of the most popular strategies for multiclass classification, namely binary decision trees (BDT) and One-Against-One SVM (OAOSVM). Experimental results show that the proposed architecture can decrease the overall classification error in comparison with the BDT architecture. Outperforming OAOSVM, however, would require an optimization technique for selecting the optimal set of training data.
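The combination of binary discriminators with a decision table can be illustrated with a minimal sketch. It is not the authors' implementation: the class labels, the feature indices, the thresholds, and the table entries below are all hypothetical placeholders, since the abstract does not specify them. The sketch only shows the general pattern of running four rule-based BD discriminators and looking up their joint output in a table.

```python
from typing import Callable, Dict, Sequence, Tuple

# A binary discriminator maps a feature vector to a 0/1 decision.
Discriminator = Callable[[Sequence[float]], int]


def threshold_rule(index: int, threshold: float) -> Discriminator:
    """Rule-based BD: compare one selected parameter with a threshold."""
    def decide(features: Sequence[float]) -> int:
        return 1 if features[index] > threshold else 0
    return decide


# Four hypothetical discriminators (feature indices and thresholds are made up).
DISCRIMINATORS = [
    threshold_rule(0, 0.5),
    threshold_rule(1, 0.2),
    threshold_rule(2, 0.7),
    threshold_rule(3, 0.1),
]

# Hypothetical decision table: joint discriminator output -> class label.
DECISION_TABLE: Dict[Tuple[int, ...], str] = {
    (1, 0, 0, 0): "speech",
    (1, 1, 0, 0): "speech_with_music",
    (0, 1, 0, 0): "music",
    (0, 0, 1, 0): "noise",
    (0, 0, 0, 1): "silence",
}


def classify(
    features: Sequence[float],
    discriminators: Sequence[Discriminator] = DISCRIMINATORS,
    table: Dict[Tuple[int, ...], str] = DECISION_TABLE,
    default: str = "unknown",
) -> str:
    """Run every discriminator and map the resulting bit pattern to a class."""
    pattern = tuple(d(features) for d in discriminators)
    return table.get(pattern, default)


if __name__ == "__main__":
    print(classify([0.9, 0.1, 0.0, 0.0]))
    print(classify([0.9, 0.3, 0.0, 0.0]))
```

Replacing any `threshold_rule` with a function that wraps a trained SVM's binary prediction would give the SVM-based variant of the same structure; patterns missing from the table fall back to the `default` label.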

[1]  Sebastian Stüker,et al.  Segmentation of Telephone Speech Based on Speech and Non-speech Models , 2013, SPECOM.

[2]  J. Juhar,et al.  SVM binary decision tree architecture for multi-class audio classification , 2012, Proceedings ELMAR-2012.

[3]  Bin Wang,et al.  Text Classification Using Support Vector Machine with Mixture of Kernel , 2012 .

[4]  Jason Weston,et al.  Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.

[5]  Lei Chen,et al.  Mixed Type Audio Classification with Support Vector Machine , 2006, 2006 IEEE International Conference on Multimedia and Expo.

[6]  Jhing-Fa Wang,et al.  Environmental Sound Classification using Hybrid SVM/KNN Classifier and MPEG-7 Audio Low-Level Descriptor , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.

[7]  Adnan Yazici,et al.  Content-Based Classification and Segmentation of Mixed-Type Audio by Using MPEG-7 Features , 2009, 2009 First International Conference on Advances in Multimedia.

[8]  Guillaume Gravier,et al.  Corpus description of the ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News , 2004, LREC.

[9]  Milos Cernak A Comparison of Decision Tree Classifiers for Automatic Diagnosis of Speech Recognition Errors , 2010, Comput. Informatics.

[10]  M. Rashidi,et al.  A New SVM-based Mix Audio Classification , 2008, 2008 40th Southeastern Symposium on System Theory (SSST).

[11]  Qiang Huang,et al.  SVM-Based Audio Classification for Content- Based Multimedia Retrieval , 2007, MCAM.

[12]  Chin Kim On,et al.  Mel-frequency cepstral coefficient analysis in speech recognition , 2006, 2006 International Conference on Computing & Informatics.

[13]  Matej Grasic,et al.  Online Speech/Music Segmentation Based on the Variance Mean of Filter Bank Energy , 2009, EURASIP J. Adv. Signal Process..

[14]  Milos Cernak,et al.  Effective Triphone Mapping for Acoustic Modeling in Speech Recognition , 2011, INTERSPEECH.

[15]  Chih-Jen Lin,et al.  A comparison of methods for multiclass support vector machines , 2002, IEEE Trans. Neural Networks.

[16]  Wasfi G. Al-Khatib,et al.  Machine-learning based classification of speech and music , 2006, Multimedia Systems.

[17]  Soumya Priyadarsini Panda,et al.  A Rule-Based Concatenative Approach to Speech Synthesis in Indian Language Text-to-Speech Systems , 2015 .

[18]  Nikos Fakotakis,et al.  Automatic Sound Classification of Radio Broadcast News , 2012 .

[19]  Sayan Mukherjee,et al.  Classifying Microarray Data Using Support Vector Machines , 2003 .

[20]  Taras Butko,et al.  Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion , 2011, EURASIP J. Audio Speech Music. Process..

[21]  Bicheng Li,et al.  Audio classification based on SVM-UBM , 2008, 2008 9th International Conference on Signal Processing.

[22]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[23]  Eduardo Lleida,et al.  Audio segmentation-by-classification approach based on factor analysis in broadcast news domain , 2014, EURASIP J. Audio Speech Music. Process..

[24]  Damjan Vlaj,et al.  Acoustic classification and segmentation using modified spectral roll-off and variance-based features , 2013, Digit. Signal Process..

[25]  Björn W. Schuller,et al.  Feature Selection and Stacking for Robust Discrimination of Speech, Monophonic Singing, and Polyphonic Music , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[26]  Francesc Alías,et al.  Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification , 2012, IEEE Transactions on Multimedia.

[27]  Boaz Lerner,et al.  Support vector machine-based image classification for genetic syndrome diagnosis , 2005, Pattern Recognit. Lett..

[28]  Yong Luo,et al.  Pitch-density-based features and an SVM binary tree approach for multi-class audio classification in broadcast news , 2011, Multimedia Systems.

[29]  Benjamin Naumann,et al.  Learning And Soft Computing Support Vector Machines Neural Networks And Fuzzy Logic Models , 2016 .

[30]  Lie Lu,et al.  Content analysis for audio classification and segmentation , 2002, IEEE Trans. Speech Audio Process..

[31]  João Paulo da Silva Neto,et al.  Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data , 2008, PROPOR.

[32]  Alex Acero,et al.  Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .

[33]  Tshilidzi Marwala,et al.  Image Classification Using SVMs: One-against-One Vs One-against-All , 2007, ArXiv.

[34]  Nima Mesgarani,et al.  Discrimination of speech from nonspeech based on multiscale spectro-temporal Modulations , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[35]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[36]  Shigeo Abe Support Vector Machines for Pattern Classification , 2010, Advances in Pattern Recognition.

[37]  Mark Goadrich,et al.  The relationship between Precision-Recall and ROC curves , 2006, ICML.

[38]  V. Tiwari MFCC and its applications in speaker recognition , 2010 .

[39]  Dima Ruinskiy,et al.  A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation , 2009, EURASIP J. Audio Speech Music. Process..

[40]  P. Dhanalakshmi,et al.  Classification of audio signals using SVM and RBFNN , 2009, Expert Syst. Appl..

[41]  Jozef Vavrek,et al.  Broadcast news audio classification using SVM binary trees , 2012, 2012 35th International Conference on Telecommunications and Signal Processing (TSP).

[42]  Lie Lu,et al.  Content-based audio classification and segmentation by using support vector machines , 2003, Multimedia Systems.

[43]  S. Sathiya Keerthi,et al.  Which Is the Best Multiclass SVM Method? An Empirical Study , 2005, Multiple Classifier Systems.

[44]  Jozef Vavrek,et al.  Audio classification utilizing a rule-based approach and the support vector machine classifier , 2013, 2013 36th International Conference on Telecommunications and Signal Processing (TSP).

[45]  Matús Pleva,et al.  TUKE-BNews-SK: Slovak Broadcast News Corpus Construction and Evaluation , 2014, LREC.

[46]  Yu Song,et al.  Feature extraction and classification for audio information in news video , 2009, 2009 International Conference on Wavelet Analysis and Pattern Recognition.