Simon King | Gustav Eje Henter | Gilles Degottex | Thomas Merritt
[1] Heiga Zen,et al. Directly modeling voiced and unvoiced components in speech waveforms by neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Zhizheng Wu,et al. Improving Trajectory Modelling for DNN-Based Speech Synthesis by Using Stacked Bottleneck Features and Minimum Generation Error Training , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Hiroshi Ishiguro,et al. Analysis of the Roles and the Dynamics of Breathy and Whispery Voice Qualities in Dialogue Speech , 2010, EURASIP J. Audio Speech Music. Process..
[4] G. Fant. The LF-model revisited. Transformations and frequency domain analysis , 1995, Dept. for Speech, Music and Hearing Quarterly Progress and Status Report.
[5] Simon King,et al. Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Heiga Zen,et al. Hidden Semi-Markov Model Based Speech Synthesis System , 2006 .
[7] Zhizheng Wu,et al. From HMMs to DNNs: Where do the improvements come from? , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[9] Junichi Yamagishi,et al. Initial investigation of speech synthesis based on complex-valued neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[11] Heiga Zen,et al. Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences , 2007, Comput. Speech Lang..
[12] Oliver Watts,et al. Letter-based speech synthesis , 2010, SSW.
[13] Kai Yu,et al. Multi-task joint-learning of deep neural networks for robust speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[14] Xin Wang,et al. Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System , 2016, INTERSPEECH.
[15] Li-Rong Dai,et al. Minimum Kullback–Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Heiga Zen,et al. The HMM-based speech synthesis system (HTS) version 2.0 , 2007, SSW.
[17] Tomoki Toda,et al. Implementation of Computationally Efficient Real-Time Voice Conversion , 2012, INTERSPEECH.
[18] Heiga Zen,et al. Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Moncef Gabbouj,et al. Ways to Implement Global Variance in Statistical Speech Synthesis , 2012, INTERSPEECH.
[20] Dong Yu,et al. Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Cassia Valentini-Botinhao,et al. Modelling acoustic feature dependencies with artificial neural networks: Trajectory-RNADE , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Junichi Yamagishi,et al. Multiple feed-forward deep neural networks for statistical parametric speech synthesis , 2015, INTERSPEECH.
[23] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[24] Oliver Watts,et al. Evaluating comprehension of natural and synthetic conversational speech , 2016 .
[25] Qiguang Lin,et al. Glottal source‐vocal tract acoustic interaction , 1987 .
[26] Yannis Stylianou,et al. Harmonic plus noise models for speech, combined with statistical methods, for speech and speaker modification , 1996 .
[27] Mikko Kurimo,et al. Noise in HMM-Based Speech Synthesis Adaptation: Analysis, Evaluation Methods and Experiments , 2014, IEEE Journal of Selected Topics in Signal Processing.
[28] Keiichi Tokuda,et al. A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks , 2016, INTERSPEECH.
[29] Yoshihiko Nankaku,et al. The effect of neural networks in statistical parametric speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Simon King,et al. Robustness of HMM-based speech synthesis , 2008, INTERSPEECH.
[31] Hugo Larochelle,et al. A Deep and Tractable Density Estimator , 2013, ICML.
[32] William J. Byrne,et al. Fast, low-artifact speech synthesis considering global variance , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[33] Cassia Valentini-Botinhao,et al. Are we using enough listeners? No! - An empirically-supported critique of Interspeech 2014 TTS evaluations , 2015, INTERSPEECH.
[34] Takashi Nose,et al. HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation , 2009, IEICE Trans. Inf. Syst..
[35] S. Holm. A Simple Sequentially Rejective Multiple Test Procedure , 1979 .
[36] D. Klatt,et al. Analysis, synthesis, and perception of voice quality variations among female and male talkers. , 1990, The Journal of the Acoustical Society of America.
[37] Srikanth Ronanki,et al. Median-based generation of synthetic speech durations using a non-parametric approach , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[38] Heiga Zen,et al. Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Hideki Kawahara,et al. STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds , 2006 .
[40] Yoshihiko Nankaku,et al. Temporal modeling in neural network based statistical parametric speech synthesis , 2016, SSW.
[41] Bhuvana Ramabhadran,et al. Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Simon King,et al. Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech , 2014, INTERSPEECH.
[43] Heiga Zen,et al. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] V. Ramamoorthy,et al. Enhancement of ADPCM speech by adaptive postfiltering , 1984, AT&T Bell Laboratories Technical Journal.
[45] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, IEEE International Conference on Acoustics, Speech, and Signal Processing.
[46] Heiga Zen,et al. The Effect of Using Normalized Models in Statistical Speech Synthesis , 2011, INTERSPEECH.
[47] Keiichi Tokuda,et al. Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis , 2005, Systems and Computers in Japan.
[48] H. M. Hanson,et al. Glottal characteristics of female speakers: acoustic correlates. , 1997, The Journal of the Acoustical Society of America.
[49] John Kane,et al. Improved automatic detection of creak , 2013, Comput. Speech Lang..
[50] Thomas F. Quatieri,et al. Speech analysis/Synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..
[51] Junichi Yamagishi,et al. A perceptual investigation of wavelet-based decomposition of f0 for text-to-speech synthesis , 2015, INTERSPEECH.
[52] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[53] Charles Kemp,et al. How to Grow a Mind: Statistics, Structure, and Abstraction , 2011, Science.
[54] Zhi-Jie Yan,et al. Rich context modeling for high quality HMM-based TTS , 2009, INTERSPEECH.
[55] Takashi Nose,et al. Statistical Parametric Speech Synthesis Based on Gaussian Process Regression , 2014, IEEE Journal of Selected Topics in Signal Processing.
[56] Xia Wang,et al. Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.
[57] Yamato Ohtani,et al. Continuous F0 in the source-excitation generation for HMM-based TTS: Do we need voiced/unvoiced classification? , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[58] Vincent Pollet,et al. Synthesis by generation and concatenation of multiform segments , 2008, INTERSPEECH.
[59] Paavo Alku,et al. HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[60] Takashi Nose,et al. Efficient Implementation of Global Variance Compensation for Parametric Speech Synthesis , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[61] Mark J. F. Gales,et al. Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..
[62] Roger K. Moore. A Bayesian explanation of the ‘Uncanny Valley’ effect and related psychological phenomena , 2012, Scientific Reports.
[63] Xin Wang,et al. A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora , 2016, SSW.
[64] Shuang Xu,et al. Gating recurrent mixture density networks for acoustic modeling in statistical parametric speech synthesis , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[65] Zhizheng Wu,et al. Merlin: An Open Source Neural Network Speech Synthesis System , 2016, SSW.
[66] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[67] Bajibabu Bollepalli,et al. High-pitched excitation generation for glottal vocoding in statistical parametric speech synthesis using a deep neural network , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[68] Kai Yu,et al. Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis , 2011, IEEE Transactions on Audio, Speech, and Language Processing.