Acoustic Scene Classification by using Combination of MODWPT and Spectral Features

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

ABSTRACT

Acoustic Scene Classification (ASC) classifies audio signals to infer the context of the recorded environment. An audio scene contains a mixture of background sound and a variety of sound events. In this paper, we combine level-5 maximal overlap discrete wavelet packet transform (MODWPT) features with six time-domain and frequency-domain features: energy entropy, short-time energy, spectral roll-off, spectral centroid, spectral flux, and zero crossing rate, summarized by their average and standard deviation. Several classifiers can address the ASC task; we use the DCASE Challenge 2016 dataset to compare K-nearest neighbors (KNN), Support Vector Machine (SVM), and Ensemble Bagged Trees on the combined wavelet and spectral features. Choosing a suitable classification method and feature extraction scheme is essential for the ASC task. In this system, we extract from the audio signal 32 MODWPT energy values and 32 relative energy values at level 5, together with 6 statistic values, and then apply the extracted features to the different classifiers.
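The six time-domain and frequency-domain features named in the abstract can be sketched per frame and then summarized by average and standard deviation. The following is a minimal NumPy illustration, not the authors' implementation: the function names (`frame_signal`, `frame_features`, `summarize`), the frame length, hop size, 85% roll-off threshold, and number of entropy sub-blocks are all assumptions, and the feature definitions are common textbook variants that may differ from those used in the paper. The sketch returns a mean and a standard deviation for each of the six features (12 numbers); how the source arrives at its count of 6 statistic values is not specified in the abstract. MODWPT energies are not computed here.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_features(x, sr, frame_len=1024, hop=512, rolloff=0.85, n_sub=8):
    """Per-frame features; all parameter defaults are illustrative assumptions.

    frame_len must be divisible by n_sub.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    eps = 1e-12  # guards against division by zero and log(0)

    # Short-time energy: mean squared amplitude of each frame.
    energy = np.mean(frames ** 2, axis=1)

    # Zero crossing rate: fraction of adjacent sample pairs that change sign.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Energy entropy: Shannon entropy of normalized sub-block energies.
    sub_e = np.sum(frames.reshape(len(frames), n_sub, -1) ** 2, axis=2)
    p = sub_e / (np.sum(sub_e, axis=1, keepdims=True) + eps)
    entropy = -np.sum(p * np.log2(p + eps), axis=1)

    # Magnitude spectrum of each Hann-windowed frame.
    mag = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)

    # Spectral centroid: magnitude-weighted mean frequency (Hz).
    centroid = np.sum(mag * freqs, axis=1) / (np.sum(mag, axis=1) + eps)

    # Spectral roll-off: lowest frequency below which `rolloff` of the
    # cumulative spectral magnitude lies.
    cum = np.cumsum(mag, axis=1)
    roll = freqs[np.argmax(cum >= rolloff * cum[:, -1:], axis=1)]

    # Spectral flux: squared change between consecutive normalized spectra
    # (the first frame has no predecessor, so its flux is set to 0).
    norm = mag / (np.sum(mag, axis=1, keepdims=True) + eps)
    flux = np.concatenate([[0.0], np.sum(np.diff(norm, axis=0) ** 2, axis=1)])

    return {
        "energy_entropy": entropy,
        "short_time_energy": energy,
        "spectral_rolloff": roll,
        "spectral_centroid": centroid,
        "spectral_flux": flux,
        "zero_crossing_rate": zcr,
    }

def summarize(features):
    """Concatenate the mean and standard deviation of every feature track."""
    return np.array([stat(v) for v in features.values() for stat in (np.mean, np.std)])

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 1000 * t + 0.3)  # synthetic 1 kHz test tone
    print(summarize(frame_features(tone, sr)))
```

In a pipeline like the one described, a vector of this kind would be concatenated with the wavelet-based energy features before being passed to a classifier such as KNN, SVM, or bagged trees.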
