Thousands of Voices for HMM-Based Speech Synthesis–Analysis and Application of TTS Systems Built on Various ASR Corpora
Simon King | Mikko Kurimo | Oliver Watts | Junichi Yamagishi | Jilei Tian | Yong Guan | Rile Hu | Yi-Jian Wu | Keiichiro Oura | Reima Karhila | John Dines | Keiichi Tokuda | Bela Usabaev
[1] Junichi Yamagishi,et al. Thousands of voices for HMM-based speech synthesis , 2010 .
[2] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[3] Keiichi Tokuda,et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation , 1994, ICSLP.
[4] T. Barnwell. Correlation analysis of subjective and objective measures for speech quality , 1980, ICASSP.
[5] Koichi Shinoda,et al. MDL-based context-dependent subword modeling for speech recognition , 2000 .
[6] Simon King,et al. Analysis of unsupervised and noise-robust speaker-adaptive HMM-based speech synthesis systems toward a unified ASR and TTS framework , 2009 .
[7] Philip C. Woodland,et al. The development of the HTK Broadcast News transcription system: An overview , 2002, Speech Commun..
[8] Simon King,et al. Statistical analysis of the Blizzard Challenge 2007 listening test results , 2007 .
[9] Daniel Erro Eslava. Intra-lingual and cross-lingual voice conversion using harmonic plus stochastic models , 2008 .
[10] Richard M. Schwartz,et al. A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[11] Jonathan G. Fiscus,et al. 1993 Benchmark Tests for the ARPA Spoken Language Program , 1994, HLT.
[12] Elmar Nöth,et al. QMOS - A Robust Visualization Method for Speaker Dependencies with Different Microphones , 2009 .
[13] Simon King,et al. Multisyn: Open-domain unit selection for the Festival speech synthesis system , 2007, Speech Commun..
[14] Tanja Schultz,et al. GlobalPhone: a multilingual speech and text database developed at Karlsruhe University , 2002, INTERSPEECH.
[15] Junichi Yamagishi,et al. An unified and automatic approach of Mandarin HTS system , 2010, SSW.
[16] Heiga Zen,et al. Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Steven E. Stern,et al. Computer Synthesized Speech Technologies: Tools for Aiding Impairment , 2010 .
[18] Heiga Zen,et al. The HMM-based speech synthesis system (HTS) version 2.0 , 2007, SSW.
[19] Shuichi Itahashi,et al. The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus , 1998, ICSLP.
[20] Keiichi Tokuda,et al. Speech synthesis using HMMs with dynamic features , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[21] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[22] Simon King,et al. The Blizzard Challenge 2007 , 2007 .
[23] Keiichi Tokuda,et al. Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.
[24] Simon King,et al. The Blizzard Challenge 2009 , 2009 .
[25] J. Langlois,et al. Attractive Faces Are Only Average , 1990 .
[26] Steve Renals,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[27] Keiichi Tokuda,et al. An adaptive algorithm for mel-cepstral analysis of speech , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[28] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[29] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[30] Takao Kobayashi,et al. Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training , 2007, IEICE Trans. Inf. Syst..
[31] Keiichi Tokuda,et al. XIMERA: a new TTS from ATR based on corpus-based technologies , 2004, SSW.
[32] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[33] Susan Fitt,et al. Synthesis of regional English using a keyword lexicon , 1999, EUROSPEECH.
[34] Yoshihiko Nankaku,et al. State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis , 2009, INTERSPEECH.
[35] Heiga Zen,et al. Hidden Semi-Markov Model Based Speech Synthesis System , 2006 .
[36] Keiichi Tokuda,et al. Imposture using synthetic speech against speaker verification based on spectrum and pitch , 2000, INTERSPEECH.
[37] Krzysztof Marasek,et al. SPEECON – Speech Databases for Consumer Devices: Database Specification and Validation , 2002, LREC.
[38] Eric Moulines,et al. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones , 1989, Speech Commun..
[39] Simon King,et al. Measuring the Gap Between HMM-Based ASR and TTS , 2010, IEEE Journal of Selected Topics in Signal Processing.
[40] Heiga Zen,et al. Reformulating the HMM as a Trajectory Model , 2004 .
[41] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[42] Simon King,et al. The Blizzard Challenge 2008 , 2008 .
[43] William J. Byrne,et al. Acoustic training from heterogeneous data sources: experiments in Mandarin conversational telephone speech transcription , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[44] Satoshi Nakamura,et al. The ATR Multilingual Speech-to-Speech Translation System , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[45] J. Foote,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995 .
[46] Heiga Zen,et al. The HTS-2008 System: Yet Another Evaluation of the Speaker-Adaptive HMM-based Speech Synthesis System in The 2008 Blizzard Challenge , 2008 .
[47] Frank K. Soong,et al. An HMM-Based Mandarin Chinese Text-To-Speech System , 2006, ISCSLP.
[48] Alan W. Black,et al. Optimizing segment label boundaries for statistical speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[49] Junichi Yamagishi,et al. Revisiting the security of speaker verification systems against imposture using synthetic speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[50] Jonathan G. Fiscus,et al. DARPA Resource Management Benchmark Test Results June 1990 , 1990, HLT.
[51] A. Gray,et al. Distance measures for speech processing , 1976 .
[52] Simon King,et al. Robustness of HMM-based speech synthesis , 2008, INTERSPEECH.
[53] Keiichi Tokuda,et al. On the security of HMM-based speaker verification systems against imposture using synthetic speech , 1999, EUROSPEECH.
[54] Mark J. F. Gales. Adaptive training for robust ASR , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01..
[55] Mark J. F. Gales,et al. The Application of Hidden Markov Models in Speech Recognition , 2007, Found. Trends Signal Process..
[56] Javier Macias-Guarasa,et al. Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: conversion texto a voz , 2008 .
[57] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .
[58] Stavros Tsakalidis,et al. Cross-Corpus Normalization Of Diverse Acoustic Training Data for Robust HMM Training , 2005 .
[59] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[60] Takao Kobayashi,et al. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[61] Phil D. Green,et al. Building personalised synthesised voices for individuals with dysarthria using the HTS toolkit , 2010 .
[62] Heiga Zen,et al. Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005 , 2007, IEICE Trans. Inf. Syst..
[63] Makoto Shozakai,et al. Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models , 2004, INTERSPEECH.