A Re-trained Model Based On Multi-kernel Convolutional Neural Network for Acoustic Scene Classification

This paper proposes a deep learning framework for Acoustic Scene Classification (ASC), the task of identifying the location in which a recording was made. For front-end feature extraction, we use three types of spectrogram: Gammatone (GAM), log-Mel, and Constant Q Transform (CQT). For back-end classification, we present a re-trained model that uses a multi-kernel CDNN-based architecture for the pre-training process and a DNN-based network for the post-training process. Our results on the DCASE 2016 dataset show a significant improvement of nearly 8% over the DCASE baseline of 77.2%.

Lam Pham | Trang Hoang | Tuan Nguyen | Dat Ngo | Linh Tran
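
The three front-end spectrograms named in the abstract differ mainly in how they space their frequency bands. The sketch below is a minimal illustration of that difference, not the authors' feature pipeline: it computes band centre frequencies on the mel scale (log-Mel), the ERB-rate scale commonly used to place gammatone filters (GAM), and the geometric spacing of a CQT. The frequency range, band count, and bins per octave are hypothetical values chosen for illustration; the abstract does not state the settings used in the paper.

```python
import math


def hz_to_mel(f):
    """HTK-style mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def hz_to_erb_rate(f):
    """ERB-rate scale (Glasberg and Moore), often used to space gammatone filters."""
    return 21.4 * math.log10(1.0 + 0.00437 * f)


def erb_rate_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def warped_centres(fmin, fmax, n_bands, to_scale, from_scale):
    """Return n_bands centre frequencies equally spaced on a warped scale."""
    lo, hi = to_scale(fmin), to_scale(fmax)
    step = (hi - lo) / (n_bands - 1)
    return [from_scale(lo + i * step) for i in range(n_bands)]


def cqt_centres(fmin, n_bins, bins_per_octave):
    """Geometrically spaced centres: f_k = fmin * 2^(k / bins_per_octave)."""
    return [fmin * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


# Hypothetical settings, for illustration only (not taken from the paper).
FMIN, FMAX, N_BANDS = 50.0, 11025.0, 64

mel = warped_centres(FMIN, FMAX, N_BANDS, hz_to_mel, mel_to_hz)
gam = warped_centres(FMIN, FMAX, N_BANDS, hz_to_erb_rate, erb_rate_to_hz)
cqt = cqt_centres(FMIN, n_bins=N_BANDS, bins_per_octave=12)

for name, centres in (("log-Mel", mel), ("GAM", gam), ("CQT", cqt)):
    print(f"{name:8s} first={centres[0]:8.1f} Hz  last={centres[-1]:8.1f} Hz")
```

With these assumed settings the mel and ERB-rate axes cover the same range with different warping, while the CQT axis keeps a constant ratio between neighbouring bins, so the three spectrograms present the classifier with different views of the same signal.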
