Singing voice separation using a deep convolutional neural network trained by ideal binary mask and cross entropy
Dorien Herremans | Simon Lui | Kin Wah Edward Lin | Balamurali B. T. | Enyan Koh
[1] Tian Feng,et al. Modelling Mutual Information between Voiceprint and Optimal Number of Mel-Frequency Cepstral Coefficients in Voice Discrimination , 2014, 2014 13th International Conference on Machine Learning and Applications.
[2] Michael O'Neill,et al. The Use of Mel-frequency Cepstral Coefficients in Musical Instrument Identification , 2008, ICMC.
[3] Tuomas Virtanen,et al. Automatic Recognition of Lyrics in Singing , 2010, EURASIP J. Audio Speech Music. Process.
[4] Emmanuel Vincent,et al. A General Flexible Framework for the Handling of Prior Information in Audio Source Separation , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Jonathan Le Roux,et al. Deep clustering and conventional networks for music separation: Stronger together , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Mark D. Plumbley,et al. Evaluation of audio source separation models using hypothesis-driven non-parametric statistical methods , 2016, 2016 24th European Signal Processing Conference (EUSIPCO).
[7] Derry Fitzgerald,et al. Single Channel Vocal Separation using Median Filtering and Factorisation Techniques , 2010 .
[8] Bryan Pardo,et al. Music/Voice Separation Using the Similarity Matrix , 2012, ISMIR.
[9] Simon Lui,et al. Visualising Singing Style under Common Musical Events Using Pitch-Dynamics Trajectories and Modified TRACLUS Clustering , 2014, 2014 13th International Conference on Machine Learning and Applications.
[10] Kyogu Lee,et al. Singing Voice Separation Using RPCA with Weighted l1-norm , 2017, LVA/ICA.
[11] Simon Dixon,et al. Jointly Detecting and Separating Singing Voice: A Multi-Task Approach , 2018, LVA/ICA.
[12] G. Kramer. Auditory Scene Analysis: The Perceptual Organization of Sound by Albert Bregman (review) , 2016 .
[13] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Antoine Liutkus,et al. An Overview of Lead and Accompaniment Separation in Music , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[16] Alan V. Oppenheim,et al. Discrete-time signal processing (2nd ed.) , 1999 .
[17] Hiromasa Fujihara,et al. Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music , 2011, ISMIR.
[18] Antoine Liutkus,et al. Adaptive filtering for music/voice separation exploiting the repeating musical structure , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Katsutoshi Itoyama,et al. Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[21] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[22] Antoine Liutkus,et al. Common fate model for unison source separation , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Gaël Richard,et al. A Musically Motivated Mid-Level Representation for Pitch Estimation and Musical Audio Source Separation , 2011, IEEE Journal of Selected Topics in Signal Processing.
[24] Paris Smaragdis,et al. Singing-voice separation from monaural recordings using robust principal component analysis , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Simon Lui,et al. Sinusoidal Partials Tracking for Singing Analysis Using the Heuristic of the Minimal Frequency and Magnitude Difference , 2017, INTERSPEECH.
[26] Nancy Bertin,et al. Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis , 2009, Neural Computation.
[27] Franck Giron,et al. Improving music source separation based on deep neural networks through data augmentation and network blending , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Paris Smaragdis,et al. Singing-Voice Separation from Monaural Recordings using Deep Recurrent Neural Networks , 2014, ISMIR.
[29] J. Eggert,et al. Sparse coding and NMF , 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).
[30] Emilia Gómez,et al. An Analysis/Synthesis Framework for Automatic F0 Annotation of Multitrack Datasets , 2017, ISMIR.
[31] Emmanuel Vincent,et al. Multichannel Audio Source Separation With Deep Neural Networks , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[32] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] G. Sapiro,et al. A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.
[34] H. Sebastian Seung,et al. Algorithms for Non-negative Matrix Factorization , 2000, NIPS.
[35] Michael A. Casey,et al. Separation of Mixed Audio Sources By Independent Subspace Analysis , 2000, ICMC.
[36] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[37] Ching-Hua Chuan,et al. Modeling Temporal Tonal Relations in Polyphonic Music Through Deep Networks With a Novel Image-Based Representation , 2018, AAAI.
[38] Franck Giron,et al. Deep neural network based instrument extraction from music , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Kyogu Lee,et al. Vocal Separation from Monaural Music Using Temporal/Spectral Continuity and Sparsity Constraints , 2014, IEEE Signal Processing Letters.
[40] Rémi Gribonval,et al. Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[41] Bob L. Sturm,et al. Musical instrument identification using multiscale Mel-frequency cepstral coefficients , 2010, 2010 18th European Signal Processing Conference.
[42] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.
[43] Benjamin Schrauwen,et al. Deep content-based music recommendation , 2013, NIPS.
[44] Jyh-Shing Roger Jang,et al. SVSGAN: Singing Voice Separation Via Generative Adversarial Network , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Tillman Weyde,et al. Singing Voice Separation with Deep U-Net Convolutional Networks , 2017, ISMIR.
[46] Ching-Hua Chuan,et al. A Functional Taxonomy of Music Generation Systems , 2017, ACM Comput. Surv.
[47] Yi-Hsuan Yang,et al. Vocal activity informed singing voice separation with the iKala dataset , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] Jyh-Shing Roger Jang,et al. Singing Voice Separation and Pitch Extraction from Monaural Polyphonic Audio Music via DNN and Adaptive Pitch Tracking , 2016, 2016 IEEE Second International Conference on Multimedia Big Data (BigMM).
[49] Tristan Jehan,et al. Mining Labeled Data from Web-Scale Collections for Vocal Activity Detection in Music , 2017, ISMIR.
[50] Paris Smaragdis,et al. Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[51] Tuomas Virtanen,et al. Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[52] Antoine Liutkus,et al. Scalable audio separation with light Kernel Additive Modelling , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[53] E. C. Cmm,et al. on the Recognition of Speech, with , 2008 .
[54] Emilia Gómez,et al. Monoaural Audio Source Separation Using Deep Convolutional Neural Networks , 2017, LVA/ICA.
[55] Mark D. Plumbley,et al. Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network , 2015, LVA/ICA.
[56] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[57] Simon Lui,et al. Implementation and Evaluation of Real-Time Interactive User Interface Design in Self-learning Singing Pitch Training Apps , 2014, ICMC.
[58] Hiromasa Fujihara,et al. A Modeling of Singing Voice Robust to Accompaniment Sounds and Its Application to Singer Identification and Vocal-Timbre-Similarity-Based Music Information Retrieval , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[59] Yi Ma,et al. The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, ArXiv.
[60] Mark D. Plumbley,et al. Single Channel Audio Source Separation using Deep Neural Network Ensembles , 2016 .
[61] Matthias Mauch,et al. MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research , 2014, ISMIR.
[62] Jan Schlüter,et al. Learning to Pinpoint Singing Voice from Weakly Labeled Examples , 2016, ISMIR.
[63] Simon Dixon,et al. Adversarial Semi-Supervised Audio Source Separation Applied to Singing Voice Extraction , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[64] Alan V. Oppenheim,et al. Discrete-Time Signal Processing , 1989 .
[65] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[66] Guillaume Lemaitre,et al. Real-time Polyphonic Music Transcription with Non-negative Matrix Factorization and Beta-divergence , 2010, ISMIR.
[67] Seong Joon Oh,et al. Exploiting Saliency for Object Segmentation from Image Level Labels , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Ye Wang,et al. LyricAlly: automatic synchronization of acoustic musical signals and textual lyrics , 2004, MULTIMEDIA '04.
[69] Antoine Liutkus,et al. The 2016 Signal Separation Evaluation Campaign , 2017, LVA/ICA.
[70] Charu C. Aggarwal,et al. Neural Networks and Deep Learning , 2018, Springer International Publishing.
[71] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[72] Shankar Vembu,et al. Separation of Vocals from Polyphonic Audio Recordings , 2005, ISMIR.
[73] Bryan Pardo,et al. REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.