Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model
[1] A. Rosenberg. Effect of glottal pulse shape on the quality of natural vowels, 1969.
[2] Alan W. Black, et al. The CMU Arctic speech databases, 2004, SSW.
[3] Karen Simonyan, et al. Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders, 2017, ICML.
[4] Seppo J. Ovaska, et al. Speech signal restoration using an optimal neural network structure, 1996, Proceedings of International Conference on Neural Networks (ICNN'96).
[5] Xin Wang, et al. Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis, 2019, ArXiv.
[6] Yoshua Bengio, et al. MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis, 2019, NeurIPS.
[7] Thierry Dutoit, et al. The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Heiga Zen, et al. Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tuomo Raitio, et al. Excitation modeling for HMM-based speech synthesis: Breaking down the impact of periodic and aperiodic components, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Heiga Zen, et al. An excitation model for HMM-based speech synthesis based on residual modeling, 2007, SSW.
[11] Paavo Alku, et al. HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Peter Ladefoged, et al. Phonation types: a cross-linguistic overview, 2001, J. Phonetics.
[13] A. Rosenberg. Effect of glottal pulse shape on the quality of natural vowels, 1969, The Journal of the Acoustical Society of America.
[14] Bajibabu Bollepalli, et al. GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram, 2019, INTERSPEECH.
[15] Kumar Krishna Agrawal, et al. GANSynth: Adversarial Neural Audio Synthesis, 2019, ICLR.
[16] Lauri Juvela, et al. Transferring Neural Speech Waveform Synthesizers to Musical Instrument Sounds Generation, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Chenjie Gu, et al. DDSP: Differentiable Digital Signal Processing, 2020, ICLR.
[18] Yannis Stylianou, et al. Applying the harmonic plus noise model in concatenative speech synthesis, 2001, IEEE Trans. Speech Audio Process.
[19] Simon King, et al. Speech Waveform Reconstruction Using Convolutional Neural Networks with Noise and Periodic Inputs, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Ryuichi Yamamoto, et al. Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[22] Stefano Ermon, et al. Audio Super Resolution using Neural Networks, 2017, ICLR.
[23] Bajibabu Bollepalli, et al. GlotNet: A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[25] Xin Wang, et al. Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Alex Waibel, et al. Noise reduction using connectionist models, 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.
[27] Zhen-Hua Ling, et al. A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis, 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Xin Wang, et al. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Junichi Yamagishi, et al. Towards an improved modeling of the glottal source in statistical parametric speech synthesis, 2007, SSW.
[30] Lauri Juvela, et al. Phase perception of the glottal excitation and its relevance in statistical parametric speech synthesis, 2016, Speech Commun.
[31] Jan Skoglund, et al. LPCNet: Improving Neural Speech Synthesis through Linear Prediction, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] W. Bastiaan Kleijn, et al. On phase perception in speech, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP99).
[33] Ryan Prenger, et al. WaveGlow: A Flow-based Generative Network for Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] John Kane, et al. Data-driven detection and analysis of the patterns of creaky voice, 2014, Comput. Speech Lang.