Statistical Parametric Speech Synthesis

[1]  Keiichi Tokuda,et al.  Full covariance state duration modeling for HMM-based speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[2]  Heiga Zen,et al.  A Bayesian approach to HMM-based speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[3]  Zhizheng Wu,et al.  Improved prosody generation by maximizing joint likelihood of state and longer units , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[4]  Keiichi Tokuda,et al.  Minimum generation error training by using original spectrum as reference for log spectral distortion measure , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[5]  Feng Ding,et al.  A polynomial segment model based statistical parametric speech synthesis system , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[6]  Tomoki Toda,et al.  Probabilistic modelling of F0 in unvoiced regions in HMM based speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[7]  Thierry Dutoit,et al.  Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[8]  Tomoki Toda,et al.  Trajectory training considering global variance for HMM-based speech synthesis , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[9]  Takashi Nose,et al.  HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation , 2009, IEICE Trans. Inf. Syst..

[10]  Takao Kobayashi,et al.  Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm , 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[11]  Le Zhang,et al.  Modelling Speech Dynamics with Trajectory-HMMs , 2009 .

[12]  Li-Rong Dai,et al.  Multi-Layer F0 Modeling for HMM-Based Speech Synthesis , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.

[13]  Wei Zhang,et al.  Cross-Stream Dependency Modeling for HMM-Based Speech Synthesis , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.

[14]  Frank K. Soong,et al.  HMM-Based Mixed-Language (Mandarin-English) Speech Synthesis , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.

[15]  Yoshihiko Nankaku,et al.  Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems , 2008, 2008 6th International Symposium on Chinese Spoken Language Processing.

[16]  Yi-Jian Wu,et al.  Analysis of stream-dependent tying structure for HMM-based speech synthesis , 2008, 2008 9th International Conference on Signal Processing.

[17]  Aimilios Chalamandaris,et al.  HMM-Based Speech Synthesis for the Greek Language , 2008, TSD.

[18]  Simon King,et al.  Robustness of HMM-based speech synthesis , 2008, INTERSPEECH.

[19]  Junichi Yamagishi,et al.  Combining Statistical Parametric Speech Synthesis and Unit-Selection for Automatic Voice Cloning , 2008 .

[20]  Heiga Zen,et al.  The HTS-2008 System: Yet Another Evaluation of the Speaker-Adaptive HMM-based Speech Synthesis System in The 2008 Blizzard Challenge , 2008 .

[21]  Heiga Zen,et al.  Unsupervised adaptation for HMM-based speech synthesis , 2008, INTERSPEECH.

[22]  Junichi Yamagishi,et al.  Glottal spectral separation for parametric speech synthesis , 2008, INTERSPEECH.

[23]  Ren-Hua Wang,et al.  Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge , 2008, INTERSPEECH.

[24]  Awad H. Khalil,et al.  Usage of the HMM-Based Speech Synthesis for Intelligent Arabic Voice , 2008, Computers and Their Applications.

[25]  Heiga Zen,et al.  The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006 , 2006, IEICE Trans. Inf. Syst..

[26]  Keiichi Tokuda,et al.  Statistical approach to vocal tract transfer function estimation based on factor analyzed trajectory HMM , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[27]  Li-Rong Dai,et al.  Minimum generation error criterion considering global/local variance for HMM-based speech synthesis , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[28]  Ren-Hua Wang,et al.  Minimum unit selection error training for HMM-based unit selection speech synthesis system , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[29]  Frank K. Soong,et al.  A cross-language state mapping approach to bilingual (Mandarin-English) TTS , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[30]  Ren-Hua Wang,et al.  Minimum generation error linear regression based model adaptation for HMM-based speech synthesis , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[31]  Takashi Nose,et al.  Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[32]  Heiga Zen,et al.  Performance evaluation of the speaker-independent HMM-based speech synthesis system “HTS 2007” for the Blizzard Challenge 2007 , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[33]  Keiichi Tokuda,et al.  On the state definition for a trainable excitation model in HMM-based speech synthesis , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[34]  Jordi Adell,et al.  Corpus and Voices for Catalan Speech Synthesis , 2008, LREC.

[35]  A. Bonafonte,et al.  Flexible harmonic/stochastic modeling for HMM-based speech synthesis , 2008 .

[36]  S. Sakti,et al.  Development of HMM-based Indonesian Speech Synthesis , 2008 .

[37]  Uden Sherpa,et al.  Pioneering Dzongkha Text-to-Speech Synthesis , 2008 .

[38]  Vincent Pollet,et al.  Synthesis by generation and concatenation of multiform segments , 2008, INTERSPEECH.

[39]  Moncef Gabbouj,et al.  Evaluation of Finnish unit selection and HMM-based speech synthesis , 2008, INTERSPEECH.

[40]  Frank K. Soong,et al.  Generating natural F0 trajectory with additive trees , 2008, INTERSPEECH.

[41]  Heiga Zen,et al.  Probabilistic feature mapping based on trajectory HMMs , 2008, INTERSPEECH.

[42]  Masami Akamine,et al.  Multilevel parametric-base F0 model for speech synthesis , 2008, INTERSPEECH.

[43]  Zhizheng Wu,et al.  Duration refinement by jointly optimizing state and longer unit likelihood , 2008, INTERSPEECH.

[44]  Sabine Buchholz,et al.  Comparing QMT1 and HMMs for the synthesis of American English prosody , 2008 .

[45]  Keiichi Tokuda,et al.  Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis , 2008, INTERSPEECH.

[46]  David Malah,et al.  Statistical text-to-speech synthesis with improved dynamics , 2008, INTERSPEECH.

[47]  Lirong Dai,et al.  Minimum generation error linear regression based model adaptation for HMM-based speech synthesis , 2008 .

[48]  Paavo Alku,et al.  HMM-based Finnish text-to-speech system utilizing glottal inverse filtering , 2008, INTERSPEECH.

[49]  T. Dutoit,et al.  On the use of Machine Learning in Statistical Parametric Speech Synthesis , 2008 .

[50]  Simon King,et al.  The Blizzard Challenge 2008 , 2008 .

[51]  Tomoki Toda,et al.  Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[52]  Takashi Nose,et al.  A Style Control Technique for HMM-Based Expressive Speech Synthesis , 2007, IEICE Trans. Inf. Syst..

[53]  Simon King,et al.  Statistical analysis of the Blizzard Challenge 2007 listening test results , 2007 .

[54]  Heiga Zen,et al.  Speaker-Independent HMM-based Speech Synthesis System: HTS-2007 System for the Blizzard Challenge 2007 , 2007 .

[55]  Heiga Zen,et al.  An excitation model for HMM-based speech synthesis based on residual modeling , 2007, SSW.

[56]  Heiga Zen,et al.  Hidden Semi-Markov Model Based Speech Synthesis System , 2006 .

[57]  Ren-Hua Wang,et al.  HMM-Based Hierarchical Unit Selection Combining Kullback-Leibler Divergence with Likelihood Criterion , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[58]  Sadaoki Furui,et al.  Combining Gaussian Mixture Model with Global Variance Term to Improve the Quality of an HMM-Based Polyglot Speech Synthesizer , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[59]  Xia Wang,et al.  A Novel HMM-Based TTS System using Both Continuous HMMs and Discrete HMMs , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[60]  Sumio Watanabe,et al.  Almost All Learning Machines are Singular , 2007, 2007 IEEE Symposium on Foundations of Computational Intelligence.

[61]  Takao Kobayashi,et al.  Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training , 2007, IEICE Trans. Inf. Syst..

[62]  HMM-based Spanish speech synthesis using CBR as F0 estimator , 2007 .

[63]  Heiga Zen,et al.  Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences , 2007, Comput. Speech Lang..

[64]  Heiga Zen,et al.  Model-space MLLR for trajectory HMMs , 2007, INTERSPEECH.

[65]  Minsoo Hahn,et al.  Two-Band Excitation for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..

[66]  Takao Kobayashi,et al.  Implementation and evaluation of an HMM-based Thai speech synthesis system , 2007, INTERSPEECH.

[67]  Heiga Zen,et al.  Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005 , 2007, IEICE Trans. Inf. Syst..

[68]  Takashi Nose,et al.  Style estimation of speech based on multiple regression hidden semi-Markov model , 2007, INTERSPEECH.

[69]  Heng Lu,et al.  The USTC and iFlytek Speech Synthesis Systems for Blizzard Challenge 2007 , 2007 .

[70]  Sacha Krstulovic,et al.  An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements , 2007, INTERSPEECH.

[71]  Joan Claudi Socoró,et al.  Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish , 2007, SSW.

[72]  Jonathan Le Roux,et al.  Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[73]  Simon King,et al.  Speech Recognition Using Linear Dynamic Models , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[74]  Heiga Zen,et al.  The HMM-based speech synthesis system (HTS) version 2.0 , 2007, SSW.

[75]  Junichi Yamagishi,et al.  Towards an improved modeling of the glottal source in statistical parametric speech synthesis , 2007, SSW.

[76]  Frank K. Soong,et al.  An HMM-Based Mandarin Chinese Text-To-Speech System , 2006, ISCSLP.

[77]  Jong-Jin Kim,et al.  HMM-based Korean speech synthesis system for hand-held devices , 2006, IEEE Transactions on Consumer Electronics.

[78]  Tetsunori Kobayashi,et al.  Hybrid Voice Conversion of Unit Selection and Generation Using Prosody Dependent HMM , 2006, IEICE Trans. Inf. Syst..

[79]  Sadaoki Furui,et al.  New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer , 2006, Speech Commun..

[80]  Dong Yu,et al.  Structured speech modeling , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[81]  Ren-Hua Wang,et al.  Minimum Generation Error Training for HMM-Based Speech Synthesis , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[82]  Yoshihiko Nankaku,et al.  On the Use of Phonetic Information for Mapping from Articulatory Movements to Vocal Tract Spectrum , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[83]  Jong-Jin Kim,et al.  Implementation and Evaluation of an HMM-Based Korean Speech Synthesis System , 2006, IEICE Trans. Inf. Syst..

[84]  Takao Kobayashi,et al.  A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features , 2006, IEICE Trans. Inf. Syst..

[85]  Paul Taylor.  Unifying unit selection and hidden Markov model speech synthesis , 2006, INTERSPEECH.

[86]  Heiga Zen,et al.  Speaker adaptation of trajectory HMMs using feature-space MLLR , 2006, INTERSPEECH.

[87]  Coralie Hemptinne.  Master Thesis: Integration of the Harmonic plus Noise Model (HNM) into the Hidden Markov Model-Based Speech Synthesis System (HTS) , 2006 .

[88]  Wu Yi-jian.  HMM-based Trainable Speech Synthesis for Chinese , 2006 .

[89]  Kazumasa Yamamoto,et al.  Mel-LSP Parameterization for HMM-based Speech Synthesis , 2006 .

[90]  Wu Guo,et al.  Minimum generation error criterion for tree-based clustering of context dependent HMMs , 2006, INTERSPEECH.

[91]  Takao Kobayashi,et al.  Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis , 2006, INTERSPEECH.

[92]  Sherif Abdou,et al.  Improving Arabic HMM based speech synthesis quality , 2006, INTERSPEECH.

[93]  Teknillinen Korkeakoulu,et al.  Auditory quality evaluation of present Finnish text-to-speech systems , 2006 .

[94]  Junichi Yamagishi,et al.  Average-Voice-Based Speech Synthesis , 2006 .

[95]  Ivo Ipsic,et al.  Croatian HMM based speech synthesis , 2006, 28th International Conference on Information Technology Interfaces, 2006..

[96]  Ren-Hua Wang,et al.  Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix format , 2006, INTERSPEECH.

[97]  Alan W. Black,et al.  The Blizzard Challenge 2006 CMU Entry introducing hybrid trajectory-selection synthesis , 2006 .

[98]  Yuan Jiang,et al.  Multi-tier Non-uniform Unit Selection for Corpus-based Speech Synthesis , 2006 .

[99]  Ren-Hua Wang,et al.  USTC System for Blizzard Challenge 2006 an Improved HMM-based Speech Synthesis Method , 2006, Blizzard Challenge.

[100]  Alan W. Black,et al.  The Blizzard Challenge 2006 , 2006 .

[101]  Alan W. Black,et al.  CLUSTERGEN: a statistical parametric synthesizer using trajectory modeling , 2006, INTERSPEECH.

[102]  Keiichi Tokuda,et al.  A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..

[103]  Takao Kobayashi,et al.  Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing , 2005, IEICE Trans. Inf. Syst..

[104]  Keikichi Hirose,et al.  Synthesis of F0 contours using generation process model parameters predicted from unlabeled corpora: application to emotional speech synthesis , 2005, Speech Commun..

[105]  Sadaoki Furui,et al.  Polyglot synthesis using a mixture of monolingual corpora , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[106]  Masatsune Tamura,et al.  Scalable concatenative speech synthesis based on the plural unit selection and fusion method , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[107]  Soufiane Rouibia,et al.  Unit selection for speech synthesis based on a new acoustic target cost , 2005, INTERSPEECH.

[108]  Keiichi Tokuda,et al.  HMM-based European Portuguese TTS system , 2005, INTERSPEECH.

[109]  Keiichi Tokuda,et al.  The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasets , 2005, INTERSPEECH.

[110]  Shinsuke Sakai,et al.  A probabilistic approach to unit selection for corpus-based speech synthesis , 2005, INTERSPEECH.

[111]  Christina L. Bennett.  Large scale evaluation of corpus-based synthesizers: results and lessons from the Blizzard Challenge 2005 , 2005, INTERSPEECH.

[112]  France Mihelic,et al.  Evaluation of the Slovenian HMM-Based Speech Synthesis System , 2004, TSD.

[113]  Simon King,et al.  Articulatory feature recognition using dynamic Bayesian networks , 2007, Comput. Speech Lang..

[114]  Cyril Allauzen,et al.  Statistical Modeling for Unit Selection in Speech Synthesis , 2004, ACL.

[115]  Naonori Ueda,et al.  Variational Bayesian estimation and clustering for speech recognition , 2004, IEEE Transactions on Speech and Audio Processing.

[116]  Masaaki Honda,et al.  Estimation of articulatory movements from speech acoustics using an HMM-based speech production model , 2004, IEEE Transactions on Speech and Audio Processing.

[117]  Mark J. F. Gales,et al.  Factor analysed hidden Markov models for speech recognition , 2004, Comput. Speech Lang..

[118]  Peder A. Olsen,et al.  Modeling inverse covariance matrices by basis expansion , 2002, IEEE Transactions on Speech and Audio Processing.

[119]  Homayounpour Mohammad Mahdi,et al.  Farsi speech synthesis using hidden Markov model and decision trees , 2004 .

[120]  Michael Picheny,et al.  A corpus-based approach to expressive speech synthesis , 2004, SSW.

[121]  Takao Kobayashi,et al.  A style control technique for HMM-based speech synthesis , 2004, INTERSPEECH.

[122]  Keiichi Tokuda,et al.  Decision-tree backing-off in HMM-based speech synthesis , 2004, INTERSPEECH.

[123]  Keiichi Tokuda,et al.  Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis , 2004, SSW.

[124]  Keiichi Tokuda,et al.  XIMERA: a new TTS from ATR based on corpus-based technologies , 2004, SSW.

[125]  Heiga Zen,et al.  Hidden semi-Markov model based speech synthesis , 2004, INTERSPEECH.

[126]  Toshio Hirai,et al.  Using 5 ms segments in concatenative speech synthesis , 2004, SSW.

[127]  R. Bakis,et al.  A corpus-based approach to <ahem/> expressive speech synthesis , 2004 .

[128]  Takayuki Ito,et al.  A concatenative speech synthesis method using context dependent phoneme sequences with variable length as search units , 2004, SSW.

[129]  Heiga Zen,et al.  An introduction of trajectory model into HMM-based speech synthesis , 2004, SSW.

[130]  Mark J. F. Gales,et al.  Switching linear dynamical systems for speech recognition , 2003 .

[131]  Takao Kobayashi,et al.  Modeling of various speaking styles and emotions for HMM-based speech synthesis , 2003, INTERSPEECH.

[132]  Kishore Prahallad,et al.  Unit size in unit selection speech synthesis , 2003, INTERSPEECH.

[133]  Heiga Zen,et al.  Towards the development of a Brazilian Portuguese text-to-speech system based on HMM , 2003, INTERSPEECH.

[134]  Heiga Zen,et al.  Decision tree-based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling , 2003, INTERSPEECH.

[135]  Jeff A. Bilmes,et al.  Buried Markov models: a graphical-modeling approach to automatic speech recognition , 2003, Comput. Speech Lang..

[136]  Alan W. Black.  Unit selection and emotional speech , 2003, INTERSPEECH.

[137]  Matthew J. Beal.  Variational algorithms for approximate Bayesian inference , 2003 .

[138]  Yu Shi,et al.  Power spectral density based channel equalization of large speech database for concatenative TTS system , 2002, INTERSPEECH.

[139]  Keiichi Tokuda,et al.  Eigenvoices for HMM-based speech synthesis , 2002, INTERSPEECH.

[140]  Tomohiro Nakatani,et al.  Evaluation of a speech recognition / generation method based on HMM and STRAIGHT , 2002, INTERSPEECH.

[141]  Mohan Sondhi.  Articulatory modeling: a possible role in concatenative text-to-speech synthesis , 2002 .

[142]  H. Kawai,et al.  Study on time-dependent voice quality variation in a large-scale single speaker speech corpus used for speech synthesis , 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002..

[143]  Alan W. Black,et al.  Perfect synthesis for all of the people all of the time , 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002..

[144]  M. Ostendorf,et al.  A bootstrapping approach to automating prosodic annotation for limited-domain synthesis , 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002..

[145]  Mark J. F. Gales,et al.  Factor analysed hidden Markov models , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[146]  Jeff A. Bilmes,et al.  Robust splicing costs and efficient search with BMM Models for concatenative speech synthesis , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[147]  Keiichi Tokuda,et al.  Multi-Space Probability Distribution HMM , 2002 .

[148]  H. Zen,et al.  An HMM-based speech synthesis system applied to English , 2002, Proceedings of 2002 IEEE Workshop on Speech Synthesis, 2002..

[149]  Keiichi Tokuda,et al.  Investigation of State Duration Model based on Gamma distribution for HMM-based Speech Synthesis , 2001 .

[150]  Keiichi Tokuda,et al.  Vector Quantization of Speech Spectral Parameters Using Statistics of Static and Dynamic Features , 2001 .

[151]  Keiichi Tokuda,et al.  Mixed excitation for HMM-based speech synthesis , 2001, INTERSPEECH.

[152]  Sridha Sridharan,et al.  Trainable speech synthesis with trended hidden Markov models , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[153]  Sebastian Ohnewald,et al.  Speech synthesis using stochastic Markov graphs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[154]  Shigeki Sagayama,et al.  Multiple-regression hidden Markov model , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[155]  Keiichi Tokuda,et al.  Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[156]  Chin-Hui Lee,et al.  A structural Bayes approach to speaker adaptation , 2001, IEEE Trans. Speech Audio Process..

[157]  Roland Kuhn,et al.  Rapid speaker adaptation in eigenvoice space , 2000, IEEE Trans. Speech Audio Process..

[158]  Rüdiger Hoffmann,et al.  A unified approach for speech synthesis and speech recognition using stochastic Markov graphs , 2000, INTERSPEECH.

[159]  Alan W. Black,et al.  Limited domain synthesis , 2000, INTERSPEECH.

[160]  Justin Fackrell,et al.  Segment selection in the L&H RealSpeak laboratory TTS system , 2000, INTERSPEECH.

[161]  Michael W. Macon,et al.  Unit fusion for concatenative speech synthesis , 2000, INTERSPEECH.

[162]  Mark J. F. Gales.  Cluster adaptive training of hidden Markov models , 2000, IEEE Trans. Speech Audio Process..

[163]  Keiichi Tokuda,et al.  Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[164]  Koichi Shinoda,et al.  MDL-based context-dependent subword modeling for speech recognition , 2000 .

[165]  Paul Taylor,et al.  Speech synthesis by phonological structure matching , 1999, EUROSPEECH.

[166]  Alex Acero,et al.  Formant analysis and synthesis using hidden Markov models , 1999, EUROSPEECH.

[167]  Mark J. F. Gales,et al.  Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..

[168]  Hideki Kawahara,et al.  Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..

[169]  Yannis Stylianou.  Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[170]  John S. Bridle,et al.  The HDM: a segmental hidden dynamic model of coarticulation , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[171]  Marc C. Beutnagel,et al.  The AT&T Next-Gen TTS system , 1999 .

[172]  Keiichi Tokuda,et al.  Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.

[173]  Alex Acero,et al.  HMM-based smoothing for concatenative speech synthesis , 1998, ICSLP.

[174]  Keiichi Tokuda,et al.  Duration modeling for HMM-based speech synthesis , 1998, ICSLP.

[175]  Robert E. Donovan,et al.  The IBM trainable speech synthesis system , 1998, ICSLP.

[176]  Takehiko Kagoshima,et al.  Analytic generation of synthesis units by closed loop training for totally speaker driven text to speech system (TOS drive TTS) , 1998, ICSLP.

[177]  Orhan Karaali,et al.  Speech Synthesis with Neural Networks , 1998, ArXiv.

[178]  Alex Acero,et al.  Automatic generation of synthesis units for trainable text-to-speech systems , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[179]  Mark J. F. Gales,et al.  Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..

[180]  Eric Moulines,et al.  Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..

[181]  William D. Penny,et al.  Hidden Markov models with extended observation densities , 1998 .

[182]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[183]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[184]  Paul Taylor,et al.  Automatically clustering similar units for unit selection in speech synthesis , 1997, EUROSPEECH.

[185]  Biing-Hwang Juang,et al.  Minimum classification error rate methods for speech recognition , 1997, IEEE Trans. Speech Audio Process..

[186]  Keiichi Tokuda,et al.  Voice characteristics conversion for HMM-based speech synthesis system , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[187]  Keiichi Tokuda,et al.  Speaker interpolation in HMM-based speech synthesis system , 1997, EUROSPEECH.

[188]  Richard M. Schwartz,et al.  A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[189]  Alex Acero,et al.  Whistler: a trainable text-to-speech system , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[190]  Mark J. F. Gales,et al.  The generation and use of regression class trees for MLLR adaptation , 1996 .

[191]  Alan W. Black,et al.  Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[192]  Philip C. Woodland,et al.  Improvements in an HMM-based speech synthesiser , 1995, EUROSPEECH.

[193]  Jun-ichi Takahashi,et al.  Vector-field-smoothed Bayesian learning for incremental speaker adaptation , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[194]  K. Tokuda,et al.  Speech parameter generation from HMM using dynamic features , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[195]  Philip C. Woodland,et al.  Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..

[196]  Yoshinori Sagisaka,et al.  Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks , 1995, Speech Commun..

[197]  Peter R. Jones,et al.  Implementation and Evaluation , 1995 .

[198]  J. J. Odell,et al.  The Use of Context in Large Vocabulary Speech Recognition , 1995 .

[199]  Paul Dalsgaard,et al.  Modelling intonation contours at the phrase level using continuous density hidden Markov models , 1994, Comput. Speech Lang..

[200]  Chin-Hui Lee,et al.  Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..

[201]  Mari Ostendorf,et al.  A dynamical system model for generating F0 for synthesis , 1994, SSW.

[202]  S. Srihari Mixture Density Networks , 1994 .

[203]  Herbert Gish,et al.  A segmental speech model with applications to word spotting , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[204]  John Coleman,et al.  Acoustics of American English speech : a dynamic approach , 1993 .

[205]  Li Deng,et al.  A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal , 1992, Signal Process..

[206]  Keiichi Tokuda,et al.  An adaptive algorithm for mel-cepstral analysis of speech , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[207]  Yoshinori Sagisaka,et al.  ATR μ-talk speech synthesis system , 1992, ICSLP.

[208]  Biing-Hwang Juang,et al.  New discriminative training algorithms based on the generalized probabilistic descent method , 1991, Neural Networks for Signal Processing Proceedings of the 1991 IEEE Workshop.

[209]  Eric Moulines,et al.  Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones , 1989, Speech Commun..

[210]  Frank Fallside,et al.  Lexical stress recognition using hidden Markov models , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[211]  Stephen E. Levinson,et al.  Continuously variable duration hidden Markov models for automatic speech recognition , 1986 .

[212]  Satoshi Imai,et al.  Cepstral analysis synthesis on the mel frequency scale , 1983, ICASSP.

[213]  S. Imai,et al.  Mel Log Spectrum Approximation (MLSA) filter for speech synthesis , 1983 .

[214]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[215]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[216]  F. Itakura,et al.  Minimum prediction residual principle applied to speech recognition , 1975 .

[217]  H. Akaike A new look at the statistical model identification , 1974 .

[218]  John Nicholas Holmes,et al.  Speech synthesis , 1972 .

[219]  F. Itakura,et al.  A statistical method for estimation of speech spectral density and formant frequencies , 1970 .