Paper Information - Singing Style Transfer Using Cycle-Consistent Boundary Equilibrium Generative Adversarial Networks

Singing Style Transfer Using Cycle-Consistent Boundary Equilibrium Generative Adversarial Networks

Can we make a famous rap singer like Eminem sing any of our favorite songs? Singing style transfer attempts to make this possible by replacing the vocals of the source singer in a song with those of the target singer. This paper presents a method for singing style transfer that learns from unpaired data using generative adversarial networks.

Yi-Hsuan Yang | Jyh-Shing Roger Jang | Jen-Yu Liu | Cheng-Wei Wu

[1] 拓海杉山, et al. Learning report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”, 2017.

[2] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[3] Leon A. Gatys, et al. Image Style Transfer Using Convolutional Neural Networks, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Tillman Weyde, et al. Singing Voice Separation with Deep U-Net Convolutional Networks, 2017, ISMIR.

[5] Christian Ledig, et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] David Berthelot, et al. BEGAN: Boundary Equilibrium Generative Adversarial Networks, 2017, ArXiv.

[7] Jae Lim, et al. Signal estimation from modified short-time Fourier transform, 1984.

[8] Yi-Hsuan Yang, et al. Vocal activity informed singing voice separation with the iKala dataset, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[9] Trevor Darrell, et al. Fully Convolutional Networks for Semantic Segmentation, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] Yi-Hsuan Yang, et al. Denoising Auto-Encoder with Recurrent Skip Connections and Residual Regression for Music Source Separation, 2018, 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA).

[11] Chuan Li, et al. Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, 2016, ECCV.

[12] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.

[13] Alexei A. Efros, et al. Image-to-Image Translation with Conditional Adversarial Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Yi-Hsuan Yang, et al. Event Localization in Music Auto-tagging, 2016, ACM Multimedia.