Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis
Tomoki Toda | Satoshi Nakamura | Graham Neubig | Sakriani Sakti | Shinnosuke Takamichi | Alan W. Black
[1] Y. Sagisaka,et al. Speech synthesis by rule using an optimal selection of non-uniform synthesis units , 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.
[2] Shigeru Katagiri,et al. A large-scale Japanese speech database , 1990, ICSLP.
[3] R. Plomp,et al. Effect of reducing slow temporal modulations on speech reception. , 1994, The Journal of the Acoustical Society of America.
[4] R. Plomp,et al. Effect of temporal envelope smearing on speech reception. , 1994, The Journal of the Acoustical Society of America.
[5] K. Tokuda,et al. Speech parameter generation from HMM using dynamic features , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[6] Joseph P. Olive,et al. Text-to-speech synthesis , 1995, AT&T Technical Journal.
[7] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[8] Misha Pavel,et al. Intelligibility of speech with filtered time trajectories of spectral envelopes , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[9] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[10] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[11] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[12] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.
[13] Koichi Shinoda,et al. MDL-based context-dependent subword modeling for speech recognition , 2000 .
[14] Hideki Kawahara,et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT , 2001, MAVEBA.
[15] Les E. Atlas,et al. Joint Acoustic and Modulation Frequency , 2003, EURASIP Journal on Applied Signal Processing.
[16] Hervé Bourlard,et al. Mel-cepstrum modulation spectrum (MCMS) features for robust ASR , 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding.
[17] Alan W. Black,et al. CLUSTERGEN: a statistical parametric synthesizer using trajectory modeling , 2006, INTERSPEECH.
[18] Tomoki Toda,et al. Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation , 2006, INTERSPEECH.
[19] Takao Kobayashi,et al. Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training , 2007, IEICE Trans. Inf. Syst..
[20] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[21] Takashi Nose,et al. A Style Control Technique for HMM-Based Expressive Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[22] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[23] Heiga Zen,et al. A Hidden Semi-Markov Model-Based Speech Synthesis System , 2007, IEICE Trans. Inf. Syst..
[24] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[25] Yannis Stylianou,et al. Voice Transformation: A survey , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] Hynek Hermansky,et al. Phoneme recognition using spectral envelope and modulation frequency features , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[27] Paul Taylor,et al. Text-to-Speech Synthesis , 2009 .
[28] S. King,et al. The Blizzard Challenge 2011 , 2011 .
[29] Tomoki Toda,et al. Speaker-Adaptive Speech Synthesis Based on Eigenvoice Conversion and Language-Dependent Prosodic Conversion in Speech-to-Speech Translation , 2011, INTERSPEECH.
[30] Kai Yu,et al. Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[31] J. Tao,et al. A State Duration Generation Algorithm Considering Global Variance for HMM-based Speech Synthesis , 2011 .
[32] Mark J. F. Gales,et al. Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training , 2012, INTERSPEECH.
[33] Tomoki Toda,et al. Implementation of Computationally Efficient Real-Time Voice Conversion , 2012, INTERSPEECH.
[34] Heiga Zen,et al. Product of Experts for Statistical Parametric Speech Synthesis , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[35] Tomoki Toda,et al. Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system , 2012, Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference.
[36] Moncef Gabbouj,et al. Voice Conversion Using Dynamic Kernel Partial Least Squares Regression , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[37] S. King,et al. Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction , 2012 .
[38] Tetsuya Takiguchi,et al. Voice conversion in high-order eigen space using deep belief nets , 2013, INTERSPEECH.
[39] H. Timothy Bunnell,et al. Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis , 2013, INTERSPEECH.
[40] Alan W. Black,et al. Text to speech in new languages without a standardized orthography , 2013, SSW.
[41] Heiga Zen,et al. Speech Synthesis Based on Hidden Markov Models , 2013, Proceedings of the IEEE.
[42] Kou Tanaka,et al. A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion , 2013, INTERSPEECH.
[43] William J. Byrne,et al. Fast, low-artifact speech synthesis considering global variance , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[44] Haizhou Li,et al. Synthetic speech detection using temporal modulation feature , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[45] Tomoki Toda,et al. Regression approaches to perceptual age control in singing voice conversion , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[46] Kai Yu,et al. An investigation of implementation and performance analysis of DNN based speech synthesis system , 2014, 2014 12th International Conference on Signal Processing (ICSP).
[47] Takashi Nose,et al. Statistical Parametric Speech Synthesis Based on Gaussian Process Regression , 2014, IEEE Journal of Selected Topics in Signal Processing.
[48] Tuomo Raitio,et al. DNN-based stochastic postfilter for HMM-based speech synthesis , 2014, INTERSPEECH.
[49] Takashi Nose,et al. A Parameter Generation Algorithm Using Local Variance for HMM-Based Speech Synthesis , 2014, IEEE Journal of Selected Topics in Signal Processing.
[50] Florian Eyben,et al. A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[51] Tomoki Toda,et al. Modulation spectrum-based post-filter for GMM-based Voice Conversion , 2014, 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[52] Takashi Nose,et al. Analysis of spectral enhancement using global variance in HMM-based speech synthesis , 2014, INTERSPEECH.
[53] Heiga Zen,et al. Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[54] Kou Tanaka,et al. An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[55] Tomoki Toda,et al. Parameter Generation Methods With Rich Context Models for High-Quality and Flexible Text-To-Speech Synthesis , 2014, IEEE Journal of Selected Topics in Signal Processing.
[56] Tomoki Toda,et al. Modified post-filter to recover modulation spectrum for HMM-based speech synthesis , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[57] Alan W. Black,et al. Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[58] Tomoki Toda,et al. A postfilter to modify the modulation spectrum in HMM-based speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[59] Yannis Agiomyrgiannakis,et al. Vocaine the vocoder and applications in speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[60] Heiga Zen,et al. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[61] Rainer Martin,et al. Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals , 2015, INTERSPEECH.
[62] Heiga Zen,et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends , 2015, IEEE Signal Processing Magazine.
[63] Yoshihiko Nankaku,et al. The effect of neural networks in statistical parametric speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).