The USTC System for Blizzard Challenge 2010
Ren-Hua Wang | Ming Lei | Jiang Yuan | Li-Rong Dai | Zhen-Hua Ling | Yu Hu | Cheng-Cheng Wang | Lu Heng
[1] Abeer Alwan,et al. Text to Speech Synthesis: New Paradigms and Advances , 2004 .
[2] Ren-Hua Wang,et al. USTC System for Blizzard Challenge 2006: an Improved HMM-based Speech Synthesis Method , 2006, Blizzard Challenge.
[3] Milos Cernak. Unit Selection Speech Synthesis in Noise , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[4] Chin-Hui Lee,et al. Hidden Markov Model Adaptation Using Maximum a Posteriori Linear Regression , 1999 .
[5] Heng Lu,et al. The USTC and iFlytek Speech Synthesis Systems for Blizzard Challenge 2007 , 2007 .
[6] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[7] Julia Hirschberg,et al. Automatic ToBI prediction and alignment to speed manual labeling of prosody , 2001, Speech Commun..
[8] Zhi-Jie Yan,et al. An HMM trajectory tiling (HTT) approach to high quality TTS , 2010, INTERSPEECH.
[9] Ren-Hua Wang,et al. Minimum Generation Error Training for HMM-Based Speech Synthesis , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[10] Toshio Hirai,et al. Using 5 ms segments in concatenative speech synthesis , 2004, SSW.
[11] Koichi Shinoda,et al. Structural MAP speaker adaptation using hierarchical priors , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[12] Li-Rong Dai,et al. Statistical modeling of syllable-level F0 features for HMM-based unit selection speech synthesis , 2010, 2010 7th International Symposium on Chinese Spoken Language Processing.
[13] Hideki Kawahara,et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT , 2001, MAVEBA.
[14] Alex Graves,et al. Supervised Sequence Labelling with Recurrent Neural Networks , 2012, Studies in Computational Intelligence.
[15] Alan W. Black,et al. Creating a database of speech in noise for unit selection synthesis , 2004, SSW.
[16] Wu Guo,et al. Minimum generation error criterion for tree-based clustering of context dependent HMMs , 2006, INTERSPEECH.
[17] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[18] Koichi Shinoda,et al. MDL-based context-dependent subword modeling for speech recognition , 2000 .
[19] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Ren-Hua Wang,et al. HMM-Based Hierarchical Unit Selection Combining Kullback-Leibler Divergence with Likelihood Criterion , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[21] Alan W. Black,et al. Issues in building general letter to sound rules , 1998, SSW.
[22] Ren-Hua Wang,et al. Minimum unit selection error training for HMM-based unit selection speech synthesis system , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[23] Mark J. F. Gales,et al. Lightly supervised recognition for automatic alignment of large coherent speech recordings , 2010, INTERSPEECH.
[24] Zhigang Cao,et al. Phonetic transcription verification with generalized posterior probability , 2005, INTERSPEECH.
[25] Heiga Zen,et al. Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters , 2010, SSW.
[26] Keiichi Tokuda,et al. Hidden Markov models based on multi-space probability distribution for pitch pattern modeling , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[27] Huaiyu Zhu. On Information and Sufficiency , 1997 .
[28] Simon King,et al. The Blizzard Challenge 2007 , 2007 .
[29] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[30] Heiga Zen,et al. Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems , 2009, INTERSPEECH.