Acoustic feature prediction from semantic features for expressive speech using deep neural networks
[1] Antonio Bonafonte,et al. Creating expressive synthetic voices by unsupervised clustering of audiobooks , 2015, INTERSPEECH.
[2] 24th European Signal Processing Conference, EUSIPCO 2016, Budapest, Hungary, August 29 - September 2, 2016 , 2016, European Signal Processing Conference.
[3] Cecilia Ovesdotter Alm,et al. Emotions from Text: Machine Learning for Text-based Emotion Prediction , 2005, HLT.
[4] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[5] J. Pennebaker,et al. The Secret Life of Pronouns , 2003, Psychological science.
[6] Mark J. F. Gales,et al. Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training , 2012, INTERSPEECH.
[7] Julie Carson-Berndsen,et al. Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters , 2011, INTERSPEECH.
[8] Pere Barnola,et al. Conversión de Texto en Habla Multidominio , 2004 .
[9] Mark J. F. Gales,et al. Integrated Expression Prediction and Speech Synthesis From Text , 2014, IEEE Journal of Selected Topics in Signal Processing.
[10] Inma Hernáez,et al. Improved HNM-Based Vocoder for Statistical Synthesizers , 2011, INTERSPEECH.
[11] Björn W. Schuller,et al. Speaker Independent Speech Emotion Recognition by Ensemble Classification , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[12] Mark J. F. Gales,et al. Unsupervised clustering of emotion and voice styles for expressive TTS , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Antonio Bonafonte,et al. Ogmios: The UPC Text-to-Speech synthesis system for Spoken Translation , 2006 .
[14] Douglas A. Reynolds,et al. Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process.
[15] Gemma Boleda,et al. Wikicorpus: A Word-Sense Disambiguated Multilingual Wikipedia Corpus , 2010, LREC.
[16] Florin Curelaru,et al. Front-End Factor Analysis For Speaker Verification , 2018, 2018 International Conference on Communications (COMM).
[17] Paula Lopez-Otero,et al. iVectors for Continuous Emotion Recognition , 2014 .
[18] Björn W. Schuller,et al. Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles , 2005, INTERSPEECH.
[19] Patrick Kenny,et al. Eigenvoice modeling with sparse training data , 2005, IEEE Transactions on Speech and Audio Processing.
[20] Carlo Strapparava,et al. Learning to identify emotions in text , 2008, SAC '08.
[21] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[22] Oliver Watts,et al. Unsupervised learning for text-to-speech synthesis , 2013 .
[23] L. Lamel,et al. Emotion detection in task-oriented spoken dialogues , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).
[24] Dong Yu,et al. Deep Learning: Methods and Applications , 2014, Found. Trends Signal Process.
[25] Shrikanth S. Narayanan,et al. Combining acoustic and language information for emotion recognition , 2002, INTERSPEECH.
[26] Björn W. Schuller,et al. Recognizing Affect from Linguistic Information in 3D Continuous Space , 2011, IEEE Transactions on Affective Computing.
[27] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.