Learning deep multimodal affective features for spontaneous speech emotion recognition

[1]  Xiaoming Zhao,et al.  Spontaneous Speech Emotion Recognition Using Multiscale Deep Convolutional LSTM , 2022, IEEE Transactions on Affective Computing.

[2]  Li Liu,et al.  Wavelet packet analysis for speaker-independent emotion recognition , 2020, Neurocomputing.

[3]  Zhao Ren,et al.  Exploring Deep Spectrum Representations via Attention-Based Recurrent and Convolutional Neural Networks for Speech Emotion Recognition , 2019, IEEE Access.

[4]  Jing Han,et al.  Compact Convolutional Recurrent Neural Networks via Binarization for Speech Emotion Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[5]  Peng Song,et al.  Transfer Linear Subspace Learning for Cross-Corpus Speech Emotion Recognition , 2019, IEEE Transactions on Affective Computing.

[6]  Zhiyuan Li,et al.  Feature-Level and Model-Level Audiovisual Fusion for Emotion Recognition in the Wild , 2019, 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR).

[7]  Ping Lu,et al.  Audio-visual emotion fusion (AVEF): A deep efficient weighted approach , 2019, Inf. Fusion.

[8]  Jun Du,et al.  Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition , 2019, 2019 International Joint Conference on Neural Networks (IJCNN).

[9]  Heye Zhang,et al.  IoT-based 3D convolution for video salient object detection , 2019, Neural Computing and Applications.

[10]  Haishuai Wang,et al.  Deep Spectrum Feature Representations for Speech Emotion Recognition , 2018, Proceedings of the Joint Workshop of the 4th Workshop on Affective Social Multimedia Computing and first Multi-Modal Affective Computing of Large-Scale Multimedia Data.

[11]  Wen Gao,et al.  Learning Affective Features With a Hybrid Deep Model for Audio–Visual Emotion Recognition , 2018, IEEE Transactions on Circuits and Systems for Video Technology.

[12]  Wen Gao,et al.  Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching , 2018, IEEE Transactions on Multimedia.

[13]  Björn W. Schuller,et al.  Speech emotion recognition , 2018, Commun. ACM.

[14]  Juhan Nam,et al.  SampleCNN: End-to-End Deep Convolutional Neural Networks Using Very Small Filters for Music Classification , 2018.

[15]  Jun-Wei Mao,et al.  Speech emotion recognition based on feature selection and extreme learning machine decision tree , 2018, Neurocomputing.

[16]  Sung Wook Baik,et al.  Deep features-based speech emotion recognition for smart affective services , 2017, Multimedia Tools and Applications.

[17]  Juhan Nam,et al.  Sample-Level CNN Architectures for Music Auto-Tagging Using Raw Waveforms , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[18]  Zhiyuan Li,et al.  Island Loss for Learning Discriminative Features in Facial Expression Recognition , 2017, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[19]  Jian Wang,et al.  Deep Metric Learning with Angular Loss , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Haytham M. Fayek,et al.  Evaluating deep learning architectures for Speech Emotion Recognition , 2017, Neural Networks.

[21]  Che-Wei Huang,et al.  Deep convolutional recurrent neural network with attention mechanism for robust speech emotion recognition , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[22]  Cigdem Eroglu Erdem,et al.  BAUM-1: A Spontaneous Audio-Visual Face Database of Affective and Mental States , 2017, IEEE Transactions on Affective Computing.

[23]  Zhong-Qiu Wang,et al.  Learning utterance-level representations for speech emotion and age/gender recognition using deep neural networks , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[24]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[25]  Xavier Giró-i-Nieto,et al.  From pixels to sentiment: Fine-tuning CNNs for visual sentiment prediction , 2016, Image Vis. Comput.

[26]  Björn W. Schuller,et al.  The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing , 2016, IEEE Transactions on Affective Computing.

[27]  George Trigeorgis,et al.  Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[28]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Hongbin Zha,et al.  Multiple Models Fusion for Emotion Recognition in the Wild , 2015, ICMI.

[30]  Tamás D. Gedeon,et al.  Video and Image based Emotion Recognition Challenges in the Wild: EmotiW 2015 , 2015, ICMI.

[31]  Cigdem Eroglu Erdem,et al.  Affect Recognition using Key Frame Selection based on Minimum Sparse Reconstruction , 2015, ICMI.

[32]  Christopher Joseph Pal,et al.  Recurrent Neural Networks for Emotion Recognition in Video , 2015, ICMI.

[33]  Geoffrey E. Hinton,et al.  Deep Learning , 2015, Nature.

[34]  Jian Sun,et al.  Object Detection Networks on Convolutional Feature Maps , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Shiqing Zhang,et al.  Spoken emotion recognition via locality-constrained kernel sparse representation , 2015, Neural Computing and Applications.

[36]  Theodoros Iliou,et al.  Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011 , 2012, Artificial Intelligence Review.

[37]  Lorenzo Torresani,et al.  Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  Yongzhao Zhan,et al.  Learning Salient Features for Speech Emotion Recognition Using Convolutional Neural Networks , 2014, IEEE Transactions on Multimedia.

[39]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[40]  Dong Yu,et al.  Speech emotion recognition using deep neural network and extreme learning machine , 2014, INTERSPEECH.

[41]  Björn W. Schuller,et al.  AVEC 2013: the continuous audio/visual emotion and depression recognition challenge , 2013, AVEC@ACM Multimedia.

[42]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[43]  M. Sheikhan,et al.  Speech emotion recognition using FCBF feature selection method and GA-optimized fuzzy ARTMAP neural network , 2012, Neural Computing and Applications.

[44]  Fakhri Karray,et al.  Survey on speech emotion recognition: Features, classification schemes, and databases , 2011, Pattern Recognit.

[45]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[46]  Kaya Oguz,et al.  Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers , 2020, Speech Commun.

[47]  Emily Mower Provost,et al.  Cross-Corpus Acoustic Emotion Recognition with Multi-Task Learning: Seeking Common Ground While Preserving Differences , 2019, IEEE Transactions on Affective Computing.

[48]  Semiye Demircan,et al.  Application of fuzzy C-means clustering algorithm to spectral features for emotion classification from speech , 2016, Neural Computing and Applications.

[49]  Eduardo Coutinho,et al.  Cooperative Learning and its Application to Emotion Recognition from Speech , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.