An Investigation of Noise Shaping with Perceptual Weighting for Wavenet-Based Speech Generation
Tomoki Toda | Hisashi Kawai | Yoshinori Shiga | Kentaro Tachibana
[1] Masanori Morise, et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications, 2016, IEICE Trans. Inf. Syst.
[2] B. Atal, et al. Optimizing digital speech coders by exploiting masking properties of the human ear, 1978.
[3] Yoshua Bengio, et al. Char2Wav: End-to-End Speech Synthesis, 2017, ICLR.
[4] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[5] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[6] Arturo Camacho Lozano, et al. SWIPE: A Sawtooth Waveform Inspired Pitch Estimator for Speech and Music, 2011.
[7] David Talkin, et al. A Robust Algorithm for Pitch Tracking (RAPT), 2005.
[8] Zhen-Hua Ling, et al. Waveform Modeling Using Stacked Dilated Convolutional Neural Networks for Speech Bandwidth Extension, 2017, INTERSPEECH.
[9] B. Atal, et al. Predictive coding of speech signals and subjective error criteria, 1979.
[10] Keiichi Tokuda, et al. An adaptive algorithm for mel-cepstral analysis of speech, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[11] Tomoki Toda, et al. Subband wavenet with overlapped single-sideband filterbanks, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[12] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[13] Shigeru Katagiri, et al. ATR Japanese speech database as a tool of speech recognition and synthesis, 1990, Speech Commun.
[14] B. S. Atal, et al. Predictive coding of speech using analysis-by-synthesis techniques, 1990, 1990 Conference Record Twenty-Fourth Asilomar Conference on Signals, Systems and Computers, 1990.
[15] Yannis Agiomyrgiannakis, et al. Google's Next-Generation Real-Time Unit-Selection Synthesizer Using Sequence-to-Sequence LSTM-Based Autoencoders, 2017, INTERSPEECH.
[16] Masanori Morise, et al. D4C, a band-aperiodicity estimator for high-quality speech synthesis, 2016, Speech Commun.
[17] Tomoki Toda, et al. Speaker-Dependent WaveNet Vocoder, 2017, INTERSPEECH.
[18] Heiga Zen, et al. Statistical Parametric Speech Synthesis, 2007, IEEE International Conference on Acoustics, Speech, and Signal Processing.
[19] Yoshua Bengio, et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model, 2016, ICLR.
[20] Heiga Zen, et al. Statistical parametric speech synthesis using deep neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Alex Graves, et al. Conditional Image Generation with PixelCNN Decoders, 2016, NIPS.
[22] Adam Coates, et al. Deep Voice: Real-time Neural Text-to-Speech, 2017, ICML.
[23] Keiichi Tokuda, et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation, 1994, ICSLP.
[24] Tomoki Toda, et al. Statistical Voice Conversion with WaveNet-Based Waveform Generation, 2017, INTERSPEECH.
[25] Mark Hasegawa-Johnson, et al. Speech Enhancement Using Bayesian Wavenet, 2017, INTERSPEECH.
[26] Tomoki Toda, et al. An investigation of multi-speaker training for wavenet vocoder, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[27] Heiga Zen, et al. Speech Synthesis Based on Hidden Markov Models, 2013, Proceedings of the IEEE.
[28] Alexander Gutkin, et al. Recent Advances in Google Real-Time HMM-Driven Unit Selection Synthesizer, 2016, INTERSPEECH.
[29] Thomas S. Huang, et al. Fast Wavenet Generation Algorithm, 2016, ArXiv.
[30] Heiga Zen, et al. Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices, 2016, INTERSPEECH.
[31] Sercan Ömer Arik, et al. Deep Voice 2: Multi-Speaker Neural Text-to-Speech, 2017, NIPS.
[32] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).