Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis
[1] F. Itakura,et al. A statistical method for estimation of speech spectral density and formant frequencies , 1970 .
[2] F. Itakura. Line spectrum representation of linear predictor coefficients of speech signals , 1975 .
[3] K. Dzhaparidze. Parameter estimation and hypothesis testing in spectral analysis of stationary time series , 1986 .
[4] Alan V. Oppenheim,et al. Discrete-Time Signal Processing , 1989 .
[5] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[6] J. J. Odell,et al. The Use of Context in Large Vocabulary Speech Recognition , 1995 .
[7] Keiichi Tokuda,et al. Adaptive cepstral analysis of speech , 1995, IEEE Trans. Speech Audio Process.
[8] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[9] Hideki Kawahara,et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT , 2001, MAVEBA.
[10] Keiichi Tokuda,et al. Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis , 2005 .
[11] Ren-Hua Wang,et al. Minimum Generation Error Training for HMM-Based Speech Synthesis , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[12] Heiga Zen,et al. Reformulating HMM as a trajectory model by imposing explicit relationships between static and dynamic features , 2006 .
[13] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[14] Heiga Zen,et al. Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences , 2007, Comput. Speech Lang.
[15] Keiichi Tokuda,et al. Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis , 2008, INTERSPEECH.
[16] Keiichi Tokuda,et al. Statistical approach to vocal tract transfer function estimation based on factor analyzed trajectory HMM , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Heiga Zen,et al. Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters , 2010, SSW.
[18] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[19] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] Heiga Zen,et al. Deep learning in speech synthesis , 2013, SSW.
[21] Andrew W. Senior,et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.
[22] Yoshihiko Nankaku,et al. Integration of Spectral Feature Extraction and Modeling for HMM-Based Speech Synthesis , 2014, IEICE Trans. Inf. Syst.
[23] Heiga Zen,et al. Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).