Single Channel Beatbox Music Separation Using Non-negative Matrix Factorisation

This work proposes single-channel beatbox music separation based on Non-negative Matrix Factorization (NMF) with spectral masking. Beatboxing is instrument-like music produced entirely by the human voice. The input is a mixed signal of two musical beat sources. The algorithm trains on these two beat sources with NMF and then applies spectral masking to separate an individual beat source from the mixed signal. During the training stage, NMF is applied in the magnitude spectrum domain to learn a set of basis vectors for each beat source. When the mixed signal is observed, its magnitude spectrum is decomposed into a linear combination of the trained bases of both beat sources. Masks built from these decomposed spectrograms determine the contribution of each beat source to the mixed signal. This makes it possible to separate and play back a particular beat, or a chosen mixture of beats, according to musical preference, so that more versatile music can be created.
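The train-then-mask pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state the NMF cost function, the mask type, or the number of bases, so the sketch assumes multiplicative updates for the Euclidean cost, a soft ratio mask, and two bases per source. It runs on small synthetic magnitude spectrograms instead of real beatbox audio, and the names `nmf` and `separate` are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, seed=0, eps=1e-9):
    """Approximate V by W @ H using multiplicative updates (Euclidean cost).

    If W is supplied it is kept fixed and only the activations H are updated.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W1, W2, n_iter=200, eps=1e-9):
    """Decompose the mixture on the stacked trained bases, then mask it."""
    W = np.hstack([W1, W2])            # trained bases of both sources, held fixed
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k1 = W1.shape[1]
    S1 = W1 @ H[:k1]                   # rough spectrogram attributed to source 1
    S2 = W2 @ H[k1:]                   # rough spectrogram attributed to source 2
    mask1 = S1 / (S1 + S2 + eps)       # soft ratio mask (an assumption)
    return mask1 * V_mix, (1.0 - mask1) * V_mix

# Synthetic stand-ins for magnitude spectrograms (frequency bins x frames).
# Source 1 occupies the lower 8 bins and source 2 the upper 8 bins.
rng = np.random.default_rng(1)
B1 = rng.random((16, 2)); B1[8:] = 0.0
B2 = rng.random((16, 2)); B2[:8] = 0.0
V1 = B1 @ rng.random((2, 30))
V2 = B2 @ rng.random((2, 30))

# Training stage: learn a basis for each source from its own spectrogram.
W1, _ = nmf(V1, rank=2)
W2, _ = nmf(V2, rank=2)

# Separation stage: only the mixture is observed.
V_mix = V1 + V2
est1, est2 = separate(V_mix, W1, W2)
rel_err1 = np.linalg.norm(est1 - V1) / np.linalg.norm(V1)
```

With real audio, `V1`, `V2`, and `V_mix` would be short-time Fourier transform magnitudes, and each masked magnitude would be recombined with the mixture phase before inversion to a waveform. Those steps are omitted here.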
