Electrolaryngeal Speech Enhancement with Statistical Voice Conversion based on CLDNN
暂无分享,去创建一个
[1] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[2] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[3] Kou Tanaka,et al. A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation , 2014, IEICE Trans. Inf. Syst..
[4] Hideki Kawahara,et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT , 2001, MAVEBA.
[5] Tomoki Toda,et al. Statistical Voice Conversion with WaveNet-Based Waveform Generation , 2017, INTERSPEECH.
[6] Tetsuya Takiguchi,et al. Voice conversion in high-order eigen space using deep belief nets , 2013, INTERSPEECH.
[7] Tomoki Toda,et al. Speaker-Dependent WaveNet Vocoder , 2017, INTERSPEECH.
[8] Tomoki Toda,et al. The Voice Conversion Challenge 2016 , 2016, INTERSPEECH.
[9] Yariv Ephraim,et al. A signal subspace approach for speech enhancement , 1995, IEEE Trans. Speech Audio Process..
[10] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Tomoki Toda,et al. Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech , 2012, Speech Commun..
[12] Yu Tsao,et al. Voice conversion from non-parallel corpora using variational auto-encoder , 2016, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[13] A.V. Oppenheim,et al. Enhancement and bandwidth compression of noisy speech , 1979, Proceedings of the IEEE.
[14] S. Imai,et al. Mel Log Spectrum Approximation (MLSA) filter for speech synthesis , 1983 .
[15] Joseph Sylvester Chang,et al. A parametric formulation of the generalized spectral subtraction method , 1998, IEEE Trans. Speech Audio Process..
[16] Li-Rong Dai,et al. Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Kun Li,et al. Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Tomoki Toda,et al. An evaluation of alaryngeal speech enhancement methods based on voice conversion techniques , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] S. Boll,et al. Suppression of acoustic noise in speech using spectral subtraction , 1979 .
[20] Kuldip K. Paliwal,et al. Spectral Subtraction With Variance Reduced Noise Spectrum Estimates , 2006 .
[21] Ian McLoughlin,et al. Speech reconstruction using a deep partially supervised neural network , 2017, Healthcare technology letters.
[22] Shinnosuke Takamichi,et al. Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.