Cycle-consistent Adversarial Networks for Non-parallel Vocal Effort Based Speaking Style Conversion
Lauri Juvela | Junichi Yamagishi | Shreyas Seshadri | Paavo Alku | Okko Räsänen
[1] Paavo Alku,et al. Comparison of Gaussian process regression and Gaussian mixture models in spectral tilt modelling for intelligibility enhancement of telephone speech , 2015, INTERSPEECH.
[2] Christophe d'Alessandro,et al. Experiments in voice quality modification of natural speech signals: the spectral approach , 1998, SSW.
[3] L. Braida,et al. Speaking clearly for the hard of hearing IV: Further studies of the role of speaking rate. , 1996, Journal of speech and hearing research.
[4] Peter F. Driessen,et al. Transforming Perceived Vocal Effort and Breathiness Using Adaptive Pre-Emphasis Linear Prediction , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[6] Paavo Alku,et al. Spectral tilt modelling with GMMs for intelligibility enhancement of narrowband telephone speech , 2014, INTERSPEECH.
[7] Koby Crammer,et al. Non-parallel voice conversion using joint optimization of alignment by temporal context and spectral distortion , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Hirokazu Kameoka,et al. CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[9] Masanori Sugimoto,et al. Whisper to normal speech conversion using pitch estimated from spectrum , 2016, Speech Commun..
[10] Susanto Rahardja,et al. Lombard effect mimicking , 2010, SSW.
[11] Bajibabu Bollepalli,et al. GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis , 2016, INTERSPEECH.
[12] Tanja Schultz,et al. Fundamental frequency generation for whisper-to-audible speech conversion , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Hemant A. Patil,et al. Effectiveness of Dynamic Features in INCA and Temporal Context-INCA , 2018, INTERSPEECH.
[14] H. Lane,et al. The Lombard Sign and the Role of Hearing in Speech , 1971 .
[15] John H. L. Hansen,et al. Analysis and Compensation of Lombard Speech Across Noise Type and Levels With Application to In-Set/Out-of-Set Speaker Recognition , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Daniel Erro,et al. INCA Algorithm for Training Voice Conversion Systems From Nonparallel Corpora , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Gaël Richard,et al. Speech intelligibility improvement in car noise environment by voice transformation , 2017, Speech Commun..
[18] Zhizheng Wu,et al. Analysis of the Voice Conversion Challenge 2016 Evaluation Results , 2016, INTERSPEECH.
[19] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[20] Àngel Calzada Defez,et al. Vocal Effort Modification through Harmonics Plus Noise Model Representation , 2011, NOLISP.
[21] M. Picheny,et al. Speaking clearly for the hard of hearing. II: Acoustic characteristics of clear and conversational speech. , 1986, Journal of speech and hearing research.
[22] Prasanta Kumar Ghosh,et al. Whispered Speech to Neutral Speech Conversion Using Bidirectional LSTMs , 2018, INTERSPEECH.
[24] Mark J. F. Gales,et al. A Log Domain Pulse Model for Parametric Speech Synthesis , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[25] Lior Wolf,et al. Unsupervised Cross-Domain Image Generation , 2016, ICLR.
[26] Jaakko Lehtinen,et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.
[27] Heming Zhao,et al. Reconstruction of Normal Speech from Whispered Speech Based on RBF Neural Network , 2010, 2010 Third International Symposium on Intelligent Information Technology and Security Informatics.
[28] Junichi Yamagishi,et al. High-Quality Nonparallel Voice Conversion Based on Cycle-Consistent Adversarial Network , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] 拓海 杉山,et al. Study report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” , 2017 .
[30] Bhuvana Ramabhadran,et al. Bias and Statistical Significance in Evaluating Speech Synthesis with Mean Opinion Scores , 2017, INTERSPEECH.
[31] Mark A. Clements,et al. Reconstruction of speech from whispers , 2002, MAVEBA.
[32] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[33] Method for the subjective assessment of intermediate quality level of , 2014 .
[34] Hemant A. Patil,et al. Unsupervised Vocal Tract Length Warped Posterior Features for Non-Parallel Voice Conversion , 2018, INTERSPEECH.
[35] Lauri Juvela,et al. Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs , 2017, INTERSPEECH.
[36] Paavo Alku,et al. The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions , 2016, INTERSPEECH.
[37] Lauri Juvela,et al. Vocal Effort Based Speaking Style Conversion Using Vocoder Features and Parallel Learning , 2019, IEEE Access.