Co-training Approach for Label-Minimized Audio Classification

Audio classification is an important preprocessing step for audio data. However, training classification models requires large amounts of manually labeled data. To address this problem, we evaluate co-training, a semi-supervised machine learning algorithm, for content-based audio classification. Audio is divided into three classes: pure speech, pure music, and speech mixed with music. We treat different audio features as views and use the co-training algorithm to minimize the quantity of labeled data required. Experimental results on VOA Special English audio show the effectiveness of co-training for audio classification.
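The co-training loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in the paper: the abstract does not name the features or the classifier, so the sketch uses a nearest-centroid classifier as a stand-in, synthetic two-dimensional vectors as the two feature views, and hypothetical names (`centroids`, `predict`, `co_train`, `rounds`, `per_round`). In each round, a classifier trained on each view labels the unlabeled items it is most confident about, and those items join the shared labeled pool.

```python
import random

def centroids(X, y):
    """Mean feature vector per class (stand-in for the real classifier)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(cents, x):
    """Return (label, confidence); confidence is the margin between the
    squared distances to the two nearest class centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), label)
        for label, c in cents.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin

def co_train(view1, view2, labels, rounds=20, per_round=2):
    """labels maps item index -> class for the initially labeled items.
    Returns the labeled pool after co-training."""
    labeled = dict(labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view1)) if i not in labeled]
        if not unlabeled:
            break
        new = {}
        for view in (view1, view2):
            idx = list(labeled)
            cents = centroids([view[i] for i in idx], [labeled[i] for i in idx])
            # Rank unlabeled items by this view's confidence, highest first.
            scored = sorted(
                ((predict(cents, view[i]), i) for i in unlabeled),
                key=lambda t: -t[0][1],
            )
            for (label, _), i in scored[:per_round]:
                new.setdefault(i, label)  # first view to claim an item wins
        labeled.update(new)
    return labeled

# Synthetic data (an assumption for illustration): three well-separated classes.
random.seed(0)
classes = ["speech", "music", "mixed"]
truth = [classes[i % 3] for i in range(30)]

def make_view(offset):
    return [
        [10.0 * classes.index(t) + offset + random.uniform(-1, 1) for _ in range(2)]
        for t in truth
    ]

view1, view2 = make_view(0.0), make_view(5.0)
seed_labels = {0: "speech", 1: "music", 2: "mixed"}  # one labeled item per class
result = co_train(view1, view2, seed_labels)
accuracy = sum(result[i] == truth[i] for i in result) / len(result)
print(f"labeled {len(result)} items, agreement with ground truth: {accuracy:.2f}")
```

Starting from one labeled item per class, the pool grows only through the classifiers' own confident predictions, which is how co-training reduces the manual labeling effort. On real audio, the two views would be distinct feature sets extracted from the same clip, and the result depends on how well each view separates the classes on its own.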
