Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition
Runnan Li | Sheng Zhao | Jia Jia | Zhiyong Wu | Helen Meng
[1] Dik J. Hermes,et al. Expression of emotion and attitude through temporal speech variations , 2000, INTERSPEECH.
[2] Carlos Busso,et al. IEMOCAP: interactive emotional dyadic motion capture database , 2008, Lang. Resour. Evaluation.
[3] Erik Cambria,et al. Context-Dependent Sentiment Analysis in User-Generated Videos , 2017, ACL.
[4] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[6] Björn W. Schuller,et al. Recent developments in openSMILE, the Munich open-source multimedia feature extractor , 2013, ACM Multimedia.
[7] Thomas A. Funkhouser,et al. Dilated Residual Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Hui Jiang,et al. Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition , 2013, INTERSPEECH.
[9] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Lukasz Kaiser,et al. Attention Is All You Need , 2017, NIPS.
[11] George Trigeorgis,et al. The INTERSPEECH 2017 Computational Paralinguistics Challenge: Addressee, Cold & Snoring , 2017, INTERSPEECH.
[12] Marie Tahon,et al. Towards a Small Set of Robust Acoustic Features for Emotion Recognition: Challenges , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[14] Seyedmahdad Mirsamadi,et al. Automatic speech emotion recognition using recurrent neural networks with local attention , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] David M. W. Powers,et al. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation , 2011, ArXiv.
[16] Wootaek Lim,et al. Speech emotion recognition using convolutional and Recurrent Neural Networks , 2016, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[17] R. Adolphs,et al. Emotion Perception from Face, Voice, and Touch: Comparisons and Convergence , 2017, Trends in Cognitive Sciences.
[18] Ting Liu,et al. Recent advances in convolutional neural networks , 2015, Pattern Recognit.
[19] George Trigeorgis,et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[21] Erik Cambria,et al. Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).
[22] Yang Liu,et al. A Multi-Task Learning Framework for Emotion Recognition Using 2D Continuous Space , 2017, IEEE Transactions on Affective Computing.