Speech Emotion Recognition with Heterogeneous Feature Unification of Deep Neural Network
Zheng Wang | Wei Jiang | Jesse S. Jin | Xianfeng Han | Chunguang Li
[1] Erik Cambria, et al. Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis, 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).
[2] Pascal Vincent, et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion, 2010, J. Mach. Learn. Res.
[3] Markus Kächele, et al. Multiple Classifier Systems for the Classification of Audio-Visual Emotional States, 2011, ACII.
[4] Björn W. Schuller, et al. Recognizing Affect from Linguistic Information in 3D Continuous Space, 2011, IEEE Transactions on Affective Computing.
[5] Aren Jansen, et al. CNN architectures for large-scale audio classification, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Stefan Wermter, et al. Reusing Neural Speech Representations for Auditory Emotion Recognition, 2017, IJCNLP.
[7] Zhong-Qiu Wang, et al. Learning utterance-level representations for speech emotion and age/gender recognition using deep neural networks, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Nilanjan Ray, et al. Object Detection With DoG Scale-Space: A Multiple Kernel Learning Approach, 2012, IEEE Transactions on Image Processing.
[9] Ling Guan, et al. Kernel Cross-Modal Factor Analysis for Information Fusion With Application to Bimodal Emotion Recognition, 2012, IEEE Transactions on Multimedia.
[10] Björn W. Schuller, et al. The INTERSPEECH 2010 paralinguistic challenge, 2010, INTERSPEECH.
[11] Cigdem Eroglu Erdem, et al. BAUM-1: A Spontaneous Audio-Visual Face Database of Affective and Mental States, 2017, IEEE Transactions on Affective Computing.
[12] Mark A. Gregory, et al. A novel approach for MFCC feature extraction, 2010, 2010 4th International Conference on Signal Processing and Communication Systems.
[13] Chengxin Li, et al. Speech emotion recognition with acoustic and lexical features, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Alessandro Moschitti, et al. Twitter Sentiment Analysis with Deep Convolutional Neural Networks, 2015, SIGIR.
[15] Marie Tahon, et al. Towards a Small Set of Robust Acoustic Features for Emotion Recognition: Challenges, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Antonio Torralba, et al. SoundNet: Learning Sound Representations from Unlabeled Video, 2016, NIPS.
[17] Timothy F. Cootes, et al. Extraction of Visual Features for Lipreading, 2002, IEEE Trans. Pattern Anal. Mach. Intell.
[18] Björn Schuller, et al. openSMILE: the Munich versatile and fast open-source audio feature extractor, 2010, ACM Multimedia.
[19] Pavel Matejka, et al. Multimodal Emotion Recognition for AVEC 2016 Challenge, 2016, AVEC@ACM Multimedia.
[20] Yongming Huang, et al. Adaptive Wavelet Packet Filter-Bank Based Acoustic Feature for Speech Emotion Recognition, 2013.
[21] Ivan Marsic, et al. Deep Multimodal Learning for Emotion Recognition in Spoken Language, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Richard J. Davidson, et al. Test-retest reliability of voluntary emotion regulation, 2009, Psychophysiology.
[23] Paul D. Gader, et al. Model level fusion of edge histogram descriptors and Gabor wavelets for landmine detection with ground penetrating radar, 2010, 2010 IEEE International Geoscience and Remote Sensing Symposium.
[24] J. N. Gowdy, et al. CUAVE: A new audio-visual database for multimodal human-computer interface research, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] Verónica Pérez-Rosas, et al. Multimodal Sentiment Analysis of Spanish Online Videos, 2013, IEEE Intelligent Systems.
[26] Carlos Busso, et al. IEMOCAP: interactive emotional dyadic motion capture database, 2008, Lang. Resour. Evaluation.
[27] Björn W. Schuller, et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing, 2016, IEEE Transactions on Affective Computing.
[28] Elisabeth André, et al. Emotion-specific dichotomous classification and feature-level fusion of multichannel biosignals for automatic emotion recognition, 2008, 2008 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems.
[29] Nasrollah Moghaddam Charkari, et al. Multimodal information fusion application to human emotion recognition from face and speech, 2010, Multimedia Tools and Applications.
[30] Ling Guan, et al. Recognizing Human Emotional State From Audiovisual Signals, 2008, IEEE Transactions on Multimedia.
[31] Ethem Alpaydin, et al. Multiple Kernel Learning Algorithms, 2011, J. Mach. Learn. Res.
[32] Tanaya Guha, et al. Multimodal Prediction of Affective Dimensions and Depression in Human-Computer Interactions, 2014, AVEC '14.
[33] Qinghua Hu, et al. SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection, 2018, IEEE Transactions on Cybernetics.
[34] Wolfgang Menzel, et al. An architecture for incremental information fusion of cross-modal representations, 2012, 2012 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI).
[35] Guoyong Cai, et al. Convolutional Neural Networks for Multimedia Sentiment Analysis, 2015, NLPCC.
[36] Yi-Ping Phoebe Chen, et al. Acoustic Features Extraction for Emotion Recognition, 2007, 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007).
[37] Nitish Srivastava, et al. Multimodal learning with deep Boltzmann machines, 2012, J. Mach. Learn. Res.
[38] Wen Gao, et al. Learning Affective Features With a Hybrid Deep Model for Audio–Visual Emotion Recognition, 2018, IEEE Transactions on Circuits and Systems for Video Technology.
[39] Huang Jian, et al. Multimodal Emotion Recognition with Transfer Learning of Deep Neural Network, 2020.
[40] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[41] Han Wen. Review on Speech Emotion Recognition, 2014.
[42] Qi Tian, et al. HMM-Based Audio Keyword Generation, 2004, PCM.
[43] M. Shamim Hossain, et al. Audio–Visual Emotion-Aware Cloud Gaming Framework, 2015, IEEE Transactions on Circuits and Systems for Video Technology.
[44] Björn W. Schuller, et al. Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge, 2011, Speech Commun.
[45] Qinghua Hu, et al. Heterogeneous Feature Selection With Multi-Modal Deep Neural Networks and Sparse Group LASSO, 2015, IEEE Transactions on Multimedia.
[46] I. Christie, et al. Autonomic specificity of discrete emotion and dimensions of affective space: a multivariate approach, 2004, International Journal of Psychophysiology.
[47] Björn Schuller, et al. Recognizing Emotions From Whispered Speech Based on Acoustic Feature Transfer Learning, 2017, IEEE Access.
[48] Björn W. Schuller, et al. Feature selection in multimodal continuous emotion prediction, 2017, 2017 Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW).
[49] Byung Cheol Song, et al. Multi-modal emotion recognition using semi-supervised learning and multiple neural networks in the wild, 2017, ICMI.
[50] Ivan Marsic, et al. Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment, 2018, ACL.