Paper information - Singing Voice Separation using Generative Adversarial Networks

Singing Voice Separation using Generative Adversarial Networks

In this paper, we propose a novel approach that extends Wasserstein generative adversarial networks (GANs) [4] to separate the singing voice from a mixture signal. We use the mixture signal as a condition for generating the singing voice and apply a U-Net-style network to stabilize the training of the model. Experiments on the DSD100 dataset show promising results and indicate the potential of GANs for music source separation.
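As a rough illustration of the idea in the abstract, the sketch below shows how a Wasserstein critic and generator loss could be computed when the critic is conditioned on the mixture. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `condition_on_mixture`, `critic_loss`, `generator_loss`, and `toy_critic` are hypothetical, the toy critic is a fixed linear score standing in for a trained network, and the Lipschitz constraint that a Wasserstein GAN requires (for example weight clipping or a gradient penalty) is omitted.

```python
from statistics import fmean


def condition_on_mixture(mixture, voice):
    """Concatenate each mixture frame with the matching voice frame.

    Both arguments are lists of frames (lists of floats). The critic then
    scores (mixture, voice) pairs, which is how the condition enters.
    """
    if len(mixture) != len(voice):
        raise ValueError("mixture and voice must have the same number of frames")
    return [m + v for m, v in zip(mixture, voice)]


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: mean score on generated pairs minus mean on real pairs."""
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Wasserstein generator loss: negated mean critic score on generated pairs."""
    return -fmean(fake_scores)


def toy_critic(pair):
    # Hypothetical stand-in for a trained critic network: the mean of the frame.
    return sum(pair) / len(pair)


# Toy magnitude frames; real inputs would be spectrograms of the mixture and voice.
mixture = [[0.5, 0.25], [1.0, 0.75]]
real_voice = [[0.5, 0.25], [0.5, 0.25]]
fake_voice = [[0.0, 0.0], [0.0, 0.0]]  # assumed output of an untrained generator

real_scores = [toy_critic(p) for p in condition_on_mixture(mixture, real_voice)]
fake_scores = [toy_critic(p) for p in condition_on_mixture(mixture, fake_voice)]

print(critic_loss(real_scores, fake_scores))
print(generator_loss(fake_scores))
```

In an actual system the generator would be the U-Net-style network mentioned in the abstract, mapping the mixture to an estimated singing voice, and both losses would be minimized in alternation with gradient-based updates.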

Hyeong-Seok Choi

[1] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[2] Olaf Ronneberger, et al. Invited Talk: U-Net Convolutional Networks for Biomedical Image Segmentation, 2017, Bildverarbeitung für die Medizin.

[3] Alexei A. Efros, et al. Image-to-Image Translation with Conditional Adversarial Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.

[5] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.

[6] Raymond Y. K. Lau, et al. Least Squares Generative Adversarial Networks, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[7] Jacob D. Abernethy, et al. How to Train Your DRAGAN, 2017, ArXiv.

[8] Simon Osindero, et al. Conditional Generative Adversarial Nets, 2014, ArXiv.