Prosodic and Spectral iVectors for Expressive Speech Synthesis
[1] Florin Curelaru, et al. Front-End Factor Analysis For Speaker Verification, 2018, 2018 International Conference on Communications (COMM).
[2] Paula Lopez-Otero, et al. iVectors for Continuous Emotion Recognition, 2014.
[3] Heiga Zen, et al. The HMM-based speech synthesis system (HTS) version 2.0, 2007, SSW.
[4] Paul Boersma, et al. Praat: doing phonetics by computer, 2003.
[5] R. Gray, et al. Vector quantization, 1984, IEEE ASSP Magazine.
[6] Björn W. Schuller, et al. Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles, 2005, INTERSPEECH.
[7] Albino Nogueiras, et al. Interface Databases: Design and Collection of a Multilingual Emotional Speech Database, 2002, LREC.
[8] Antonio Bonafonte, et al. Creating expressive synthetic voices by unsupervised clustering of audiobooks, 2015, INTERSPEECH.
[9] Simon King, et al. Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech, 2010, Speech Commun.
[10] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.
[11] George Karypis, et al. Empirical and Theoretical Comparisons of Selected Criterion Functions for Document Clustering, 2004, Machine Learning.
[12] Mark J. F. Gales, et al. Unsupervised clustering of emotion and voice styles for expressive TTS, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Antonio Bonafonte, et al. Ogmios: The UPC Text-to-Speech synthesis system for Spoken Translation, 2006.
[14] Richard M. Schwartz, et al. A compact model for speaker-adaptive training, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[15] Sang Joon Kim, et al. A Mathematical Theory of Communication, 2006.
[16] Inma Hernáez, et al. Improved HNM-Based Vocoder for Statistical Synthesizers, 2011, INTERSPEECH.
[17] Julie Carson-Berndsen, et al. Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters, 2011, INTERSPEECH.
[18] Patrick Kenny, et al. Eigenvoice modeling with sparse training data, 2005, IEEE Transactions on Speech and Audio Processing.