Open-set Evolving Acoustic Scene Classification System

Most audio recognition/classification systems assume a static and closed-set model, where training and testing data are drawn from a prior distribution. However, in real-world audio recognition/classification problems, such a distribution is unknown, and training data is limited and incomplete at training time. As it is difficult to collect exhaustive training samples to train classifiers. Datasets at prediction time are evolving and the trained model must deal with an infinite number of unseen/unknown categories. Therefore, it is desired to have an open-set classifier that not only accurately classifies the known classes into their respective classes but also effectively identifies unknown samples and learns them. This paper introduces an open-set evolving audio classification technique, which can effectively recognize and learn unknown classes continuously in an unsupervised manner. The proposed method consists of several steps: a) recognizing sound signals and associating them with known classes while also being able to identify the unknown classes; b) detecting the hidden unknown classes among the rejected sound samples; c) learning those novel detected classes and updating the classifier. The experimental results illustrate the effectiveness of the developed approach in detecting unknown sound classes compared to extreme value machine (EVM) and Weibull-calibrated SVM (W-SVM).

[1]  Stephen McAdams,et al.  Recognition of sound sources and events , 1993 .

[2]  Q. Summerfield Book Review: Auditory Scene Analysis: The Perceptual Organization of Sound , 1992 .

[3]  Songcan Chen,et al.  Recent Advances in Open Set Recognition: A Survey , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Amit Banerjee,et al.  A support vector method for anomaly detection in hyperspectral imagery , 2006, IEEE Transactions on Geoscience and Remote Sensing.

[5]  Mark D. Plumbley,et al.  Introduction to sound scene and event analysis , 2018 .

[6]  Musaed Alhussein,et al.  Automatic Scene Recognition through Acoustic Classification for Behavioral Robotics , 2019, Electronics.

[7]  Terrance E. Boult,et al.  The Extreme Value Machine , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Nasser Kehtarnavaz,et al.  Real-Time Unsupervised Classification of Environmental Noise Signals , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[9]  Marian Verhelst,et al.  The SINS Database for Detection of Daily Activities in a Home Environment Using an Acoustic Sensor Network , 2017, DCASE.

[10]  Anderson Rocha,et al.  Toward Open Set Recognition , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Nasser Kehtarnavaz,et al.  Smartphone-based noise adaptive speech enhancement for hearing aid applications , 2016, 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).

[12]  Andrew Zisserman,et al.  Look, Listen and Learn , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Terrance E. Boult,et al.  Probability Models for Open Set Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Nicholas W. D. Evans,et al.  A one-class classification approach to generalised speaker verification spoofing countermeasures using local binary patterns , 2013, 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS).

[15]  Nicholas W. D. Evans,et al.  The open-set problem in acoustic scene classification , 2016, 2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC).

[16]  Aren Jansen,et al.  Audio Set: An ontology and human-labeled dataset for audio events , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17]  Terrance E. Boult,et al.  Multi-class Open Set Recognition Using Probability of Inclusion , 2014, ECCV.

[18]  Tuomas Virtanen,et al.  TUT database for acoustic scene classification and sound event detection , 2016, 2016 24th European Signal Processing Conference (EUSIPCO).

[19]  Justin Salamon,et al.  Look, Listen, and Learn More: Design Choices for Deep Audio Embeddings , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[20]  Harry Wechsler,et al.  Open set face recognition using transduction , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  P. Jonathon Phillips,et al.  Evaluation Methods in Face Recognition , 2011, Handbook of Face Recognition.

[22]  Chandan Srivastava,et al.  Support Vector Data Description , 2011 .