PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
[1] Ankit Shah,et al. DCASE2017 Challenge Setup: Tasks, Datasets and Baseline System , 2017, DCASE.
[2] Daniel P. W. Ellis,et al. Detecting Alarm Sounds , 2001 .
[3] Jae Hoon Ko,et al. Classification of snoring sound based on a recurrent neural network , 2019, Expert Syst. Appl..
[4] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Tuomas Virtanen,et al. TUT database for acoustic scene classification and sound event detection , 2016, 2016 24th European Signal Processing Conference (EUSIPCO).
[6] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[7] Ishwar K. Sethi,et al. Classification of general audio data for content-based retrieval , 2001, Pattern Recognit. Lett..
[8] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[9] Tuomas Virtanen,et al. Transfer learning of weakly labelled audio , 2017, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[10] Justin Salamon,et al. A Dataset and Taxonomy for Urban Sound Research , 2014, ACM Multimedia.
[11] Dan Stowell,et al. Detection and Classification of Acoustic Scenes and Events , 2015, IEEE Transactions on Multimedia.
[12] James R. Glass,et al. A Deep Residual Network for Large-Scale Acoustic Scene Analysis , 2019, INTERSPEECH.
[13] Daniel P. W. Ellis,et al. General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline , 2018, DCASE.
[14] Jeffrey P. Woodard,et al. Modeling and classification of natural sounds by product code hidden Markov models , 1992, IEEE Trans. Signal Process..
[15] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[16] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[17] Aren Jansen,et al. CNN architectures for large-scale audio classification , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Aren Jansen,et al. Audio Set: An ontology and human-labeled dataset for audio events , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Qiang Chen,et al. Network In Network , 2013, ICLR.
[20] Yong Xu,et al. Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems , 2019, ArXiv.
[21] Xavier Serra,et al. musicnn: Pre-trained convolutional neural networks for music audio tagging , 2019, ArXiv.
[22] Tuomas Virtanen,et al. Acoustic event detection in real life recordings , 2010, 2010 18th European Signal Processing Conference.
[23] Edith Law,et al. Input-agreement: a new mechanism for collecting data using human computation games , 2009, CHI.
[24] Udit Gupta,et al. Attention-based Convolutional Neural Network for Audio Event Classification with Feature Transfer Learning , 2018 .
[25] George Tzanetakis,et al. Musical genre classification of audio signals , 2002, IEEE Trans. Speech Audio Process..
[26] Yong Xu,et al. Audio Set Classification with Attention Model: A Probabilistic Perspective , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Benjamin Schrauwen,et al. Transfer Learning by Supervised Pre-training for Audio-based Music Classification , 2014, ISMIR.
[28] Tuomas Virtanen,et al. Detection and Classification of Acoustic Scenes and Events , 2018 .
[29] Karol J. Piczak. ESC: Dataset for Environmental Sound Classification , 2015, ACM Multimedia.
[30] Mark D. Plumbley,et al. Weakly Labelled AudioSet Tagging With Attention Neural Networks , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[31] Bin Yang,et al. Multi-level attention model for weakly supervised audio classification , 2018, DCASE.
[32] Tuomas Virtanen,et al. A multi-device dataset for urban acoustic scene classification , 2018, DCASE.
[33] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[34] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[35] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[36] Yun Wang. Polyphonic Sound Event Detection with Weak Labeling , 2017 .
[37] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[38] William J. Davies,et al. Generalisation in Environmental Sound Classification: The ‘Making Sense of Sounds’ Data Set and Challenge , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Mark B. Sandler,et al. Automatic Tagging Using Deep Convolutional Neural Networks , 2016, ISMIR.
[40] Juhan Nam,et al. Sample-level Deep Convolutional Neural Networks for Music Auto-tagging Using Raw Waveforms , 2017, ArXiv.
[41] P. Karsmakers,et al. An MFCC-GMM Approach for Event Detection and Classification , 2013 .
[42] Heikki Huttunen,et al. Polyphonic sound event detection using multi label deep neural networks , 2015, 2015 International Joint Conference on Neural Networks (IJCNN).
[43] Yi-Hsuan Yang,et al. Learning to Recognize Transient Sound Events using Attentional Supervision , 2018, IJCAI.
[44] Wei Dai,et al. Very deep convolutional neural networks for raw waveforms , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Mark Sandler,et al. Transfer Learning for Music Classification and Regression Tasks , 2017, ISMIR.
[46] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[47] M. Kathleen Pichora-Fuller,et al. Recognition of emotional speech for younger and older talkers: Behavioural findings from the toronto emotional speech set , 2011 .
[48] Florian Metze,et al. A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Xavier Serra,et al. End-to-end Learning for Music Audio Tagging at Scale , 2017, ISMIR.
[50] Mark Sandler,et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[51] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[52] Colin Raffel,et al. librosa: Audio and Music Signal Analysis in Python , 2015, SciPy.
[53] Yonghong Yan,et al. Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling , 2019, ArXiv.
[54] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[55] Mathieu Lagrange,et al. Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[56] Zhang Yi,et al. Spectrogram based multi-task audio classification , 2017, Multimedia Tools and Applications.
[57] Hemant A. Patil,et al. Unsupervised Filterbank Learning Using Convolutional Restricted Boltzmann Machine for Environmental Sound Classification , 2017, INTERSPEECH.
[58] Buket D. Barkana,et al. Non-speech Environmental Sound Classification Using SVMs with a New Set of Features , 2012 .
[59] Shenglan Liu,et al. Bottom-up broadcast neural network for music genre classification , 2019, Multimedia Tools and Applications.