Learning affective representations based on magnitude and dynamic relative phase information for speech emotion recognition