Paper information - Speech emotion recognition using recurrent neural networks with directional self-attention - 字舞流文

Speech emotion recognition using recurrent neural networks with directional self-attention

Dongdong Li | Zhe Wang | Linyu Sun | Jinlin Liu | Zhuo Yang

[1] Carlos Busso, et al. IEMOCAP: interactive emotional dyadic motion capture database, 2008, Lang. Resour. Evaluation.

[2] Hermann Ney, et al. LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition, 2016, INTERSPEECH.

[3] Yang Li, et al. Emotional Tone-Based Audio Continuous Emotion Recognition, 2015, MMM.

[4] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.

[5] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.

[6] Chia-Ping Chen, et al. Effective Attention Mechanism in Dynamic Models for Speech Emotion Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[7] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.

[8] Fakhri Karray, et al. Survey on speech emotion recognition: Features, classification schemes, and databases, 2011, Pattern Recognit.

[9] Louis-Philippe Morency, et al. Representation Learning for Speech Emotion Recognition, 2016, INTERSPEECH.

[10] Kuldip K. Paliwal, et al. Bidirectional recurrent neural networks, 1997, IEEE Trans. Signal Process.

[11] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.

[12] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[13] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[14] Koteswara Rao Anne, et al. A comparative analysis of classifiers in emotion recognition through acoustic features, 2014, Int. J. Speech Technol.

[15] Efthymios Tzinis, et al. Segment-based speech emotion recognition using recurrent neural networks, 2017, 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII).

[16] Ngoc Thang Vu, et al. Improving Speech Emotion Recognition with Unsupervised Representation Learning on Unlabeled Speech, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17] Jinkyu Lee, et al. High-level feature representation using recurrent neural network for speech emotion recognition, 2015, INTERSPEECH.

[18] S. J. Kabudian, et al. Speech emotion recognition using discriminative dimension reduction by employing a modified quantum-behaved particle swarm optimization algorithm, 2019, Multimedia Tools and Applications.

[19] Rafael A. Calvo, et al. Affect Detection: An Interdisciplinary Review of Models, Methods, and Their Applications, 2010, IEEE Transactions on Affective Computing.

[20] Ngoc Thang Vu, et al. Attentive Convolutional Neural Network Based Speech Emotion Recognition: A Study on the Impact of Input Features, Signal Length, and Acted Speech, 2017, INTERSPEECH.

[21] Wei Jiang, et al. SphereReID: Deep Hypersphere Manifold Embedding for Person Re-Identification, 2018, J. Vis. Commun. Image Represent.

[22] Giovanni Soda, et al. Exploiting the past and the future in protein secondary structure prediction, 1999, Bioinform.

[23] Wu Guo, et al. An Attention Pooling Based Representation Learning Method for Speech Emotion Recognition, 2018, INTERSPEECH.

[24] Jürgen Schmidhuber, et al. LSTM: A Search Space Odyssey, 2015, IEEE Transactions on Neural Networks and Learning Systems.

[25] Che-Wei Huang, et al. Deep convolutional recurrent neural network with attention mechanism for robust speech emotion recognition, 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[26] Yu Zheng, et al. Exploring Spatio-Temporal Representations by Integrating Attention-based Bidirectional-LSTM-RNNs and FCNs for Speech Emotion Recognition, 2018, INTERSPEECH.

[27] Che-Wei Huang, et al. Attention Assisted Discovery of Sub-Utterance Structure in Speech Emotion Recognition, 2016, INTERSPEECH.

[28] Seyedmahdad Mirsamadi, et al. Automatic speech emotion recognition using recurrent neural networks with local attention, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[29] Jing Yang, et al. 3-D Convolutional Recurrent Neural Networks With Attention Model for Speech Emotion Recognition, 2018, IEEE Signal Processing Letters.

[30] Marco Zaffalon, et al. Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis, 2016, J. Mach. Learn. Res.

[31] Björn W. Schuller, et al. Speech emotion recognition, 2018, Commun. ACM.

[32] Javad Frounchi, et al. Wavelet-based emotion recognition system using EEG signal, 2017, Neural Computing and Applications.

[33] Enrique Marcelo Albornoz, et al. Deep Learning for Emotional Speech Recognition, 2014, MCPR.

[34] Shi-wook Lee, et al. The Generalization Effect for Multilingual Speech Emotion Recognition across Heterogeneous Languages, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[35] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.

[36] Margaret Lech, et al. Evaluating deep learning architectures for Speech Emotion Recognition, 2017, Neural Networks.

[37] Björn Schuller, et al. Opensmile: the munich versatile and fast open-source audio feature extractor, 2010, ACM Multimedia.

[38] Tao Mei, et al. Boosting image sentiment analysis with visual attention, 2018, Neurocomputing.

[39] Jiahui Pan, et al. Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN, 2020, Speech Commun.

[40] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.