ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder
暂无分享,去创建一个
Kou Tanaka | Hirokazu Kameoka | Takuhiro Kaneko | Nobukatsu Hojo | H. Kameoka | Takuhiro Kaneko | Kou Tanaka | Nobukatsu Hojo
[1] Jung-Woo Ha,et al. StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[2] Mikihiro Nakagiri,et al. Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[4] Ole Winther,et al. Autoencoding beyond pixels using a learned similarity metric , 2015, ICML.
[5] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[6] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[7] Tomoki Toda,et al. Speaker-Dependent WaveNet Vocoder , 2017, INTERSPEECH.
[8] Marc Schröder,et al. Evaluation of Expressive Speech Synthesis With Voice Conversion and Copy Resynthesis Techniques , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Haizhou Li,et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Yoshua Bengio,et al. SampleRNN: An Unconditional End-to-End Neural Audio Generation Model , 2016, ICLR.
[11] Masanori Morise,et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications , 2016, IEICE Trans. Inf. Syst..
[12] Steve J. Young,et al. Data-driven emotion conversion in spoken English , 2009, Speech Commun..
[13] Tetsuya Takiguchi,et al. High-order sequence modeling using speaker-dependent recurrent temporal restricted boltzmann machines for voice conversion , 2014, INTERSPEECH.
[14] Keiichi Tokuda,et al. An adaptive algorithm for mel-cepstral analysis of speech , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[15] Kun Li,et al. Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[17] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[18] Jordi Bonada,et al. Observation-model error compensation for enhanced spectral envelope transformation in voice conversion , 2015, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP).
[19] Tomoki Toda,et al. Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech , 2012, Speech Commun..
[20] Kou Tanaka,et al. StarGAN-VC: non-parallel many-to-many Voice Conversion Using Star Generative Adversarial Networks , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[21] 拓海 杉山,et al. “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .
[22] Jordi Bonada,et al. Modeling and Transforming Speech Using Variational Autoencoders , 2016, INTERSPEECH.
[23] Haifeng Li,et al. A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences , 2016, INTERSPEECH.
[24] Kishore Prahallad,et al. Spectral Mapping Using Artificial Neural Networks for Voice Conversion , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[25] Max Welling,et al. Semi-supervised Learning with Deep Generative Models , 2014, NIPS.
[26] Hirokazu Kameoka,et al. Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks , 2017, INTERSPEECH.
[27] Ryan Prenger,et al. Waveglow: A Flow-based Generative Network for Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Yann Dauphin,et al. Language Modeling with Gated Convolutional Networks , 2016, ICML.
[29] Heiga Zen,et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis , 2017, ICML.
[30] Tomoki Toda,et al. Post-Filters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis , 2016 .
[31] Kou Tanaka,et al. Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[32] Tsao Yu,et al. Voice conversion from non-parallel corpora using variational auto-encoder , 2016 .
[33] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[34] Yu Tsao,et al. Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks , 2017, INTERSPEECH.
[35] Florin Curelaru,et al. Front-End Factor Analysis For Speaker Verification , 2018, 2018 International Conference on Communications (COMM).
[36] Adam Finkelstein,et al. Fftnet: A Real-Time Speaker-Dependent Neural Vocoder , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Moncef Gabbouj,et al. Voice Conversion Using Partial Least Squares Regression , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[38] Tomoki Toda,et al. Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[39] Yonghong Yan,et al. High Quality Voice Conversion through Phoneme-Based Linear Mapping Functions with STRAIGHT for Mandarin , 2007, Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007).
[40] Lauri Juvela,et al. Non-parallel voice conversion using i-vector PLDA: towards unifying speaker verification and transformation , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Sungwon Kim,et al. FloWaveNet : A Generative Flow for Raw Audio , 2018, ICML.
[42] Hyunsoo Kim,et al. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks , 2017, ICML.
[43] Junichi Yamagishi,et al. The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods , 2018, Odyssey.
[44] Tetsuya Takiguchi,et al. Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments , 2013, IEICE Trans. Fundam. Electron. Commun. Comput. Sci..
[45] David Barber,et al. Information Maximization in Noisy Channels : A Variational Approach , 2003, NIPS.
[46] John-Paul Hosom,et al. Improving the intelligibility of dysarthric speech , 2007, Speech Commun..
[47] Tetsuya Takiguchi,et al. Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines , 2014, IEICE Trans. Inf. Syst..
[48] Li-Rong Dai,et al. Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[49] Ping Tan,et al. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[50] Wei Ping,et al. ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech , 2018, ICLR.
[51] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[52] Hirokazu Kameoka,et al. CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[53] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[54] Xin Wang,et al. Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[55] Pieter Abbeel,et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.
[56] Kou Tanaka,et al. ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder , 2018, ArXiv.
[57] Jonathon Shlens,et al. Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.
[58] Shinnosuke Takamichi,et al. Non-Parallel Voice Conversion Using Variational Autoencoders Conditioned by Phonetic Posteriorgrams and D-Vectors , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[59] Tomoki Toda,et al. sprocket: Open-Source Voice Conversion Software , 2018, Odyssey.
[60] Seyed Hamidreza Mohammadi,et al. Voice conversion using deep neural networks with speaker-independent pre-training , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).