A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis
[1] Heiga Zen,et al. Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends , 2015, IEEE Signal Processing Magazine.
[2] Masanori Morise,et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications , 2016, IEICE Trans. Inf. Syst.
[3] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[4] Vassilis Tsiaras,et al. On the Use of WaveNet as a Statistical Vocoder , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Thomas Drugman,et al. Towards Achieving Robust Universal Neural Vocoding , 2018, INTERSPEECH.
[6] Ryan Prenger,et al. WaveGlow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Bajibabu Bollepalli,et al. Speaker-independent raw waveform model for glottal excitation , 2018, INTERSPEECH.
[8] Heiga Zen,et al. Speech Synthesis Based on Hidden Markov Models , 2013, Proceedings of the IEEE.
[9] Thomas Drugman,et al. Robust universal neural vocoding , 2018, ArXiv.
[10] Zhen-Hua Ling,et al. DNN-based Spectral Enhancement for Neural Waveform Generators with Low-bit Quantization , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[12] Alan W. Black,et al. The CMU Arctic speech databases , 2004, SSW.
[13] Adam Finkelstein,et al. FFTNet: A Real-Time Speaker-Dependent Neural Vocoder , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Frank K. Soong,et al. LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis , 2018, 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[15] Wei Ping,et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech , 2018, ICLR.
[16] Jan Skoglund,et al. LPCNet: Improving Neural Speech Synthesis through Linear Prediction , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Lars M. Mescheder,et al. On the convergence properties of GAN training , 2018, ArXiv.
[18] Zhizheng Wu,et al. Merlin: An Open Source Neural Network Speech Synthesis System , 2016, SSW.
[19] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[20] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[22] Tomoki Toda,et al. Speaker-Dependent WaveNet Vocoder , 2017, INTERSPEECH.
[23] Li-Rong Dai,et al. Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun.
[25] Xin Wang,et al. Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Jae S. Lim,et al. Signal estimation from modified short-time Fourier transform , 1983, ICASSP.
[27] Bajibabu Bollepalli,et al. GlotNet—A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Cong Zhou,et al. High-quality Speech Coding with SampleRNN , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[30] Xin Wang,et al. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Li-Rong Dai,et al. WaveNet Vocoder with Limited Training Data for Voice Conversion , 2018, INTERSPEECH.
[32] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[33] Sabine Buchholz,et al. Crowdsourcing Preference Tests, and How to Detect Cheating , 2011, INTERSPEECH.
[34] Sebastian Nowozin,et al. Which Training Methods for GANs do actually Converge? , 2018, ICML.
[35] Gunnar Fant,et al. Acoustic Theory of Speech Production , 1960.
[36] Xi Wang,et al. A New Glottal Neural Vocoder for Speech Synthesis , 2018, INTERSPEECH.
[37] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[38] Tomoki Toda,et al. An investigation of multi-speaker training for WaveNet vocoder , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[39] Frank K. Soong,et al. TTS synthesis with bidirectional LSTM based recurrent neural networks , 2014, INTERSPEECH.
[40] Phil Clendeninn. The Vocoder , 1940, Nature.
[41] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst.
[42] Jonathan Harrington,et al. The Acoustic Theory of Speech Production , 1999.
[43] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013.
[44] Tomoki Toda,et al. Statistical Voice Conversion with WaveNet-Based Waveform Generation , 2017, INTERSPEECH.
[45] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.
[46] Zhen-Hua Ling,et al. SampleRNN-Based Neural Vocoder for Statistical Parametric Speech Synthesis , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[47] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.