One-Class SVMs Challenges in Audio Detection and Classification Applications

Support vector machines (SVMs) have attracted considerable attention and have been used extensively and successfully in sound (event) recognition. However, extending SVMs to real-world signal processing applications remains an ongoing research topic. Our work illustrates the potential of SVMs for recognizing impulsive audio signals belonging to a complex real-world dataset. We propose to apply optimized one-class support vector machines (1-SVMs) to both the sound detection and the sound classification tasks of the sound recognition process. First, we propose an efficient and accurate approach for detecting events in a continuous audio stream. The proposed unsupervised sound detection method, which does not require any pretrained models, uses the exponential family model and 1-SVMs to approximate the generalized likelihood ratio. Then, we apply novel discriminative algorithms based on 1-SVMs with a new dissimilarity measure to address a supervised sound classification task. We compare the novel sound detection and classification methods with other popular approaches. The remarkable sound recognition results achieved in our experiments illustrate the potential of these methods and indicate that 1-SVMs are well suited for event-recognition tasks.
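
For readers unfamiliar with the two ingredients named in the abstract, the block below recalls their textbook forms: the standard ν-formulation of the one-class SVM and a generic generalized likelihood ratio for a change between two adjacent windows of an audio stream. This is a sketch under stated assumptions, not the formulation used in the paper: the symbols $\nu$, $\rho$, $\xi_i$, $\phi$, $k$, $t$, $n$ and the densities $p_0, p_1, p_2$ are introduced here for illustration only, and the paper's optimized 1-SVMs and its dissimilarity measure may differ.

```latex
% Standard one-class SVM (nu-formulation); notation is assumed, not the paper's.
\begin{align}
  \min_{w,\,\xi,\,\rho}\quad
    & \frac{1}{2}\lVert w\rVert^{2}
      + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \\
  \text{s.t.}\quad
    & \langle w, \phi(x_i)\rangle \ge \rho - \xi_i,
      \qquad \xi_i \ge 0, \quad i = 1,\dots,n .
\end{align}
% Resulting decision function, with kernel k(x, x') = <phi(x), phi(x')>:
\begin{equation}
  f(x) = \operatorname{sgn}\!\Bigl(\sum_{i=1}^{n}\alpha_i\,k(x_i, x) - \rho\Bigr).
\end{equation}
% Generic log generalized likelihood ratio for a change at time t:
% H0: x_1..x_n ~ p_0;  H1: x_1..x_t ~ p_1 and x_{t+1}..x_n ~ p_2,
% each density fitted by maximum likelihood on its own window.
\begin{equation}
  L(t) = \sum_{i=1}^{t}\log\frac{p_1(x_i)}{p_0(x_i)}
       + \sum_{i=t+1}^{n}\log\frac{p_2(x_i)}{p_0(x_i)} .
\end{equation}
% Exponential family model in feature space, with log-partition function g:
\begin{equation}
  p(x \mid \theta) = \exp\bigl(\langle\theta, \phi(x)\rangle - g(\theta)\bigr).
\end{equation}
```

In this reading, a large value of $L(t)$ suggests that the two windows are better explained by different distributions, which is the kind of evidence a detector would threshold to flag an event onset. How the 1-SVM solution is used to approximate this ratio is specific to the paper and is not reproduced here.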
