Fusing ASR Outputs in Joint Training for Speech Emotion Recognition
[1] Kenneth Ward Church, et al. Speech Emotion Recognition with Multi-Task Learning. Interspeech, 2021.
[2] Karen Livescu, et al. Layer-Wise Analysis of a Self-Supervised Speech Representation Model. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021.
[3] Tatsuya Kawahara, et al. End-to-End Speech Emotion Recognition Combined with Acoustic-to-Word ASR Model. Interspeech, 2020.
[4] Homayoon Beigi, et al. A Transfer Learning Method for Speech Emotion Recognition from Automatic Speech Recognition. arXiv, 2020.
[5] Abdel-rahman Mohamed, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. NeurIPS, 2020.
[6] Jilt Sebastian, et al. Fusion Techniques for Utterance-Level Emotion Recognition Combining Speech and Transcripts. Interspeech, 2019.
[7] Saurabh Sahu, et al. Multi-Modal Learning for Speech Emotion Recognition: An Analysis and Comparison of ASR Outputs with Ground Truth Transcription. Interspeech, 2019.
[8] Tatsuya Kawahara, et al. Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning. Interspeech, 2019.
[9] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. NeurIPS, 2019.
[10] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL, 2019.
[11] Kyomin Jung, et al. Multimodal Speech Emotion Recognition Using Audio and Text. IEEE Spoken Language Technology Workshop (SLT), 2018.
[12] Johanna D. Moore, et al. Recognizing emotions in spoken dialogue with hierarchically fused acoustic and lexical features. IEEE Spoken Language Technology Workshop (SLT), 2016.
[13] Margaret Lech, et al. On the Correlation and Transferability of Features Between Automatic Speech Recognition and Speech Emotion Recognition. Interspeech, 2016.
[14] Johanna D. Moore, et al. Emotion recognition in spontaneous and acted dialogues. International Conference on Affective Computing and Intelligent Interaction (ACII), 2015.
[15] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
[16] Carlos Busso, et al. IEMOCAP: interactive emotional dyadic motion capture database. Language Resources and Evaluation, 2008.
[17] Björn W. Schuller, et al. Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2004.
[18] Rosalind W. Picard, et al. A computational model for the automatic recognition of affect in speech. 2004.
[19] Astrid Paeschke, et al. Prosodic Characteristics of Emotional Speech: Measurements of Fundamental Frequency Movements. 2000.