Neural Audio Captioning Based on Conditional Sequence-to-Sequence Model