A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis
暂无分享,去创建一个
[1] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[2] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[3] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[4] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[5] Susan Fitt,et al. On generating combilex pronunciations via morphological analysis , 2010, INTERSPEECH.
[6] Bhuvana Ramabhadran,et al. An autoencoder neural-network based low-dimensionality approach to excitation modeling for HMM-based text-to-speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Geoffrey E. Hinton,et al. Binary coding of speech spectrograms using a deep auto-encoder , 2010, INTERSPEECH.
[8] S. King,et al. The Blizzard Challenge 2011 , 2011 .
[9] Tara N. Sainath,et al. Auto-encoder bottleneck features using deep belief networks , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Quoc V. Le,et al. Recurrent Neural Networks for Noise Reduction in Robust ASR , 2012, INTERSPEECH.
[11] Yasuo Horiuchi,et al. Reverberant speech recognition based on denoising autoencoder , 2013, INTERSPEECH.
[12] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[13] Dong Yu,et al. Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Florian Metze,et al. Extracting deep bottleneck features using stacked auto-encoders , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[16] Tuomo Raitio,et al. DNN-based stochastic postfilter for HMM-based speech synthesis , 2014, INTERSPEECH.
[17] Alan W. Black,et al. A Deep Learning Approach to Data-driven Parameterizations for Statistical Parametric Speech Synthesis , 2014, ArXiv.
[18] Bhuvana Ramabhadran,et al. Prosody contour prediction with long short-term memory, bi-directional, deep recurrent neural networks , 2014, INTERSPEECH.
[19] James R. Glass,et al. Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Lauri Juvela,et al. Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort , 2014, INTERSPEECH.
[21] Frank K. Soong,et al. TTS synthesis with bidirectional LSTM based recurrent neural networks , 2014, INTERSPEECH.
[22] Hermann Ney,et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR , 2014, INTERSPEECH.
[23] Dimitri Palaz,et al. Raw Speech Signal-based Continuous Speech Recognition using Convolutional Neural Networks , 2014 .
[24] Masanori Morise,et al. CheapTrick, a spectral envelope estimator for high-quality speech synthesis , 2015, Speech Commun..
[25] Junichi Yamagishi,et al. Multiple feed-forward deep neural networks for statistical parametric speech synthesis , 2015, INTERSPEECH.
[26] Zhizheng Wu,et al. A Function-wise Pre-training Technique for Constructing a Deep Neural Network based Spectral Model in Statistical Parametric Speech Synthesis , 2015 .
[27] Masanori Morise,et al. Error Evaluation of an F0-Adaptive Spectral Envelope Estimator in Robustness against the Additive Noise and F0 Error , 2015, IEICE Trans. Inf. Syst..
[28] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.