Improving singing voice separation using Deep U-Net and Wave-U-Net with data augmentation
[1] John G Harris,et al. A sawtooth waveform inspired pitch estimator for speech and music , 2008, The Journal of the Acoustical Society of America.
[2] A. Röbel. A New Approach to Transient Processing in the Phase Vocoder , 2003.
[3] Tillman Weyde,et al. Singing Voice Separation with Deep U-Net Convolutional Networks , 2017, ISMIR.
[4] Franck Giron,et al. Improving music source separation based on deep neural networks through data augmentation and network blending , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Fabian-Robert Stöter,et al. MUSDB18 - a corpus for music separation , 2017.
[6] Naoya Takahashi,et al. MMDenseLSTM: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation , 2018, 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).
[7] Juan Pablo Bello,et al. A Software Framework for Musical Data Augmentation , 2015, ISMIR.
[8] Bryan Pardo,et al. A simple music/voice separation method based on the extraction of the repeating musical structure , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Axel Röbel,et al. Shape-invariant speech transformation with the phase vocoder , 2010, INTERSPEECH.
[10] Arturo Camacho Lozano,et al. SWIPE: A Sawtooth Waveform Inspired Pitch Estimator for Speech and Music , 2011.
[11] Emilia Gómez,et al. Monoaural Audio Source Separation Using Deep Convolutional Neural Networks , 2017, LVA/ICA.
[12] Antoine Liutkus,et al. The 2018 Signal Separation Evaluation Campaign , 2018, LVA/ICA.
[13] Gaël Richard,et al. Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Thomas Grill,et al. Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks , 2015, ISMIR.
[15] Simon Dixon,et al. Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation , 2018, ISMIR.
[16] Jean Laroche,et al. Improved phase vocoder time-scale modification of audio , 1999, IEEE Trans. Speech Audio Process.
[17] Matthias Mauch,et al. MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research , 2014, ISMIR.
[18] X. Rodet,et al. Sound Analysis and Processing with AudioSculpt 2 , 2004, ICMC.
[19] Axel Röbel,et al. On cepstral and all-pole based spectral envelope modeling with unknown model order , 2007, Pattern Recognit. Lett.
[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[21] Shankar Vembu,et al. Separation of Vocals from Polyphonic Audio Recordings , 2005, ISMIR.
[22] Rémi Gribonval,et al. Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[23] X. Rodet. Efficient Spectral Envelope Estimation and Its Application to Pitch Shifting and Envelope Preservation , 2005.
[24] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[25] Yi Ma,et al. Robust principal component analysis? , 2009, JACM.