Speech emotion recognition using recurrent neural networks with directional self-attention
