[1] Masanobu Abe,et al. Cross-language voice conversion , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[2] K. Tokuda,et al. Speech parameter generation from HMM using dynamic features , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[3] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[4] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[5] Douglas A. Reynolds,et al. Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..
[6] Kiyohiro Shikano,et al. Cross-language Voice Conversion Evaluation Using Bilingual Databases , 2002 .
[7] H. Ney,et al. VTLN-based cross-language voice conversion , 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).
[8] Athanasios Mouchtaris,et al. Nonparallel training for voice conversion based on a parameter adaptation approach , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Chung-Hsien Wu,et al. Map-based adaptation for speech conversion using adaptation data selection and non-parallel training , 2006, INTERSPEECH.
[10] Daniel Erro,et al. Frame alignment method for cross-lingual voice conversion , 2007, INTERSPEECH.
[11] Tsuyoshi Masuda,et al. Cost Reduction of Training Mapping Function Based on Multistep Voice Conversion , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[12] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Tomoki Toda,et al. Cross-language voice conversion based on eigenvoices , 2009, INTERSPEECH.
[14] Tomoki Toda,et al. Many-to-many eigenvoice conversion with reference voice , 2009, INTERSPEECH.
[15] Daniel Erro,et al. INCA Algorithm for Training Voice Conversion Systems From Nonparallel Corpora , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Kishore Prahallad,et al. Spectral Mapping Using Artificial Neural Networks for Voice Conversion , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Frank K. Soong,et al. A frame mapping based HMM approach to cross-lingual voice transformation , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Tomoki Toda,et al. Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system , 2012, Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference.
[19] Tetsuya Takiguchi,et al. Exemplar-based voice conversion in noisy environment , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[20] Tetsuya Takiguchi,et al. Voice conversion in high-order eigen space using deep belief nets , 2013, INTERSPEECH.
[21] Peng Song,et al. Non-parallel training for voice conversion based on adaptation method , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Li-Rong Dai,et al. Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[23] Haizhou Li,et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Heiga Zen,et al. Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[26] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[27] Tetsuya Takiguchi,et al. Multiple Non-Negative Matrix Factorization for Many-to-Many Voice Conversion , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[29] Haifeng Li,et al. A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences , 2016, INTERSPEECH.
[30] Yu Tsao,et al. Voice conversion from non-parallel corpora using variational auto-encoder , 2016 .
[31] Masanori Morise,et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications , 2016, IEICE Trans. Inf. Syst..
[32] Yu Tsao,et al. Locally Linear Embedding for Exemplar-Based Spectral Conversion , 2016, INTERSPEECH.
[33] Tetsuya Takiguchi,et al. Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[34] Hao Wang,et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[35] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[36] Yu Tsao,et al. Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks , 2017, INTERSPEECH.
[37] Tomoki Toda,et al. Speaker-Dependent WaveNet Vocoder , 2017, INTERSPEECH.
[38] Yu Zhang,et al. Learning Latent Representations for Speech Generation and Transformation , 2017, INTERSPEECH.
[39] Tomoki Toda,et al. An investigation of multi-speaker training for wavenet vocoder , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[40] Hirokazu Kameoka,et al. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks , 2017, ArXiv.
[41] Tomoki Toda,et al. Statistical Voice Conversion with WaveNet-Based Waveform Generation , 2017, INTERSPEECH.
[42] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[43] Jun-Yan Zhu,et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2017, IEEE International Conference on Computer Vision (ICCV).
[44] Tomoki Toda,et al. An Evaluation of Deep Spectral Mappings and WaveNet Vocoder for Voice Conversion , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[45] Haizhou Li,et al. A Voice Conversion Framework with Tandem Feature Sparse Representation and Speaker-Adapted WaveNet Vocoder , 2018, INTERSPEECH.
[46] Tomoki Toda,et al. The NU Non-Parallel Voice Conversion System for the Voice Conversion Challenge 2018 , 2018, Odyssey.
[47] Xi Wang,et al. A New Glottal Neural Vocoder for Speech Synthesis , 2018, INTERSPEECH.
[48] Tomoki Toda,et al. An Investigation of Noise Shaping with Perceptual Weighting for Wavenet-Based Speech Generation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Junichi Yamagishi,et al. High-Quality Nonparallel Voice Conversion Based on Cycle-Consistent Adversarial Network , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] Cecilia Jarne. An heuristic approach to obtain signal envelope with a simple software implementation , 2017 .
[51] Adam Finkelstein,et al. Fftnet: A Real-Time Speaker-Dependent Neural Vocoder , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[52] Tomoki Toda,et al. Collapsed speech segment detection and suppression for WaveNet vocoder , 2018, INTERSPEECH.
[53] Bajibabu Bollepalli,et al. Speaker-independent raw waveform model for glottal excitation , 2018, INTERSPEECH.
[54] Li-Rong Dai,et al. WaveNet Vocoder with Limited Training Data for Voice Conversion , 2018, INTERSPEECH.
[55] Tomoki Toda,et al. NU Voice Conversion System for the Voice Conversion Challenge 2018 , 2018, Odyssey.
[56] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[57] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.
[58] Junichi Yamagishi,et al. The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods , 2018, Odyssey.
[59] Ryan Prenger,et al. Waveglow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[60] Yoshihiko Nankaku,et al. Deep neural network based real-time speech vocoder with periodic and aperiodic inputs , 2019, 10th ISCA Workshop on Speech Synthesis (SSW 10).
[61] Tomoki Toda,et al. Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion , 2018, 2019 27th European Signal Processing Conference (EUSIPCO).
[62] Wei Ping,et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech , 2018, ICLR.
[63] Haizhou Li,et al. Cross-lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[64] Sungwon Kim,et al. FloWaveNet : A Generative Flow for Raw Audio , 2018, ICML.
[65] Jan Skoglund,et al. LPCNET: Improving Neural Speech Synthesis through Linear Prediction , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[66] Xin Wang,et al. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[67] Tomoki Toda,et al. Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation , 2019, INTERSPEECH.
[68] Tomoki Toda,et al. The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders , 2020, Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020.
[69] Tomoki Toda,et al. A Cyclical Post-filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-speech Systems , 2020, INTERSPEECH.