Paper information - Incremental approach to NMF basis estimation for audio source separation

Incremental approach to NMF basis estimation for audio source separation

Nonnegative matrix factorization (NMF) is a matrix factorization technique that may find meaningful latent nonnegative components. However, because the objective function is non-convex, source separation performance can degrade when the iterative update of the basis matrix gets stuck in a poor local minimum. Most studies initialize the basis matrix randomly and update it iteratively to minimize a given objective function, although a few approaches, such as those based on the singular value decomposition, have been proposed for systematic initialization of the basis matrix. In this paper, we propose a novel basis estimation method inspired by the similarity between basis training and vector quantization; the method resembles the Linde-Buzo-Gray algorithm. Experiments on audio source separation showed that the proposed method outperformed NMF with random initialization by about 1.64 dB and 1.43 dB in signal-to-distortion ratio when the target sources were speech and violin, respectively.
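The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea it alludes to: growing an NMF basis incrementally by splitting basis vectors, in the spirit of codebook splitting in the Linde-Buzo-Gray algorithm. It is not the authors' method. The function names (`nmf_multiplicative`, `split_bases`, `incremental_nmf`), the Euclidean cost with multiplicative updates, the perturbation size `delta`, and the choice to duplicate and halve the activations at each split are all assumptions made for illustration.

```python
import numpy as np

def nmf_multiplicative(V, W0, H0, n_iter=200, eps=1e-12):
    """Euclidean-cost NMF with multiplicative updates, started from (W0, H0)."""
    W, H = W0.copy(), H0.copy()
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def split_bases(W, rng, delta=0.01):
    """Assumed LBG-style split: each basis becomes two slightly perturbed copies."""
    noise = delta * rng.random(W.shape)
    return np.hstack([W * (1 + noise), W * (1 - noise)])

def incremental_nmf(V, n_splits, rng, n_iter=200):
    """Start from one basis and double the basis count n_splits times."""
    # A single basis initialized to the mean column of V (an assumption).
    W = V.mean(axis=1, keepdims=True) + 1e-12
    H = rng.random((1, V.shape[1]))
    W, H = nmf_multiplicative(V, W, H, n_iter)
    for _ in range(n_splits):
        W = split_bases(W, rng)
        # Duplicating and halving H keeps W @ H unchanged right after the split.
        H = np.vstack([H, H]) / 2
        W, H = nmf_multiplicative(V, W, H, n_iter)
    return W, H

# Usage on a random nonnegative matrix standing in for a magnitude spectrogram.
rng = np.random.default_rng(0)
V = rng.random((20, 50))
W, H = incremental_nmf(V, n_splits=2, rng=rng)  # ends with 4 basis vectors
```

Because the split leaves the reconstruction `W @ H` unchanged, each refinement stage starts from the fit already reached with fewer bases instead of from a fresh random draw, which is the contrast with random initialization that the abstract emphasizes.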

Nam Soo Kim | Jong Won Shin | Kisoo Kwon | In Kyu Choi | Hyung Yong Kim

[1] Christos Boutsidis, et al. SVD based initialization: A head start for nonnegative matrix factorization, 2008, Pattern Recognit.

[2] Geoffrey J. Gordon, et al. A Unified View of Matrix Factorization Models, 2008, ECML/PKDD.

[3] Paris Smaragdis, et al. Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization, 2013, IEEE Transactions on Audio, Speech, and Language Processing.

[4] Paris Smaragdis, et al. Convolutive Speech Bases and Their Application to Supervised Speech Separation, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[5] Bhiksha Raj, et al. Compositional Models for Audio Processing: Uncovering the structure of sound mixtures, 2015, IEEE Signal Processing Magazine.

[6] Nancy Bertin, et al. Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis, 2009, Neural Computation.

[7] Asoke K. Nandi, et al. An enhanced initialization method for non-negative matrix factorization, 2013, 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP).

[8] Guillermo Sapiro, et al. Online Learning for Matrix Factorization and Sparse Coding, 2009, J. Mach. Learn. Res.

[9] Paris Smaragdis, et al. Static and Dynamic Source Separation Using Nonnegative Factorizations: A unified view, 2014, IEEE Signal Processing Magazine.

[10] Bhiksha Raj, et al. Regularized non-negative matrix factorization with temporal dependencies for speech denoising, 2008, INTERSPEECH.

[11] Stefan M. Wild. Seeding Non-Negative Matrix Factorizations with the Spherical K-Means Clustering, 2003.

[12] Tuomas Virtanen, et al. Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[13] Nam Soo Kim, et al. Target Source Separation Based on Discriminative Nonnegative Matrix Factorization Incorporating Cross-Reconstruction Error, 2015, IEICE Trans. Inf. Syst.

[14] Rémi Gribonval, et al. Performance measurement in blind audio source separation, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[15] Barak A. Pearlmutter, et al. Convolutive Non-Negative Matrix Factorisation with a Sparseness Constraint, 2006, 2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing.

[16] Nam Soo Kim, et al. NMF-Based Speech Enhancement Using Bases Update, 2015, IEEE Signal Processing Letters.

[17] Stefan M. Wild, et al. Improving non-negative matrix factorizations through structured initialization, 2004, Pattern Recognit.

[18] Amy Nicole Langville, et al. Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization, 2014, ArXiv.

[19] Robert M. Gray, et al. An Algorithm for Vector Quantizer Design, 1980, IEEE Trans. Commun.

[20] J. Larsen, et al. Wind Noise Reduction using Non-Negative Sparse Coding, 2007, 2007 IEEE Workshop on Machine Learning for Signal Processing.

[21] H. Sebastian Seung, et al. Learning the parts of objects by non-negative matrix factorization, 1999, Nature.

[22] Jie Yang, et al. Initialization enhancer for non-negative matrix factorization, 2007, Eng. Appl. Artif. Intell.

[23] Jonathan Le Roux, et al. Discriminative NMF and its application to single-channel source separation, 2014, INTERSPEECH.

[24] Methods for objective and subjective assessment of quality. Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs, 2002.

[25] Zhigang Luo, et al. Online Nonnegative Matrix Factorization With Robust Stochastic Approximation, 2012, IEEE Transactions on Neural Networks and Learning Systems.

[26] Chih-Jen Lin, et al. Projected Gradient Methods for Nonnegative Matrix Factorization, 2007, Neural Computation.

[27] Patrik O. Hoyer, et al. Non-negative Matrix Factorization with Sparseness Constraints, 2004, J. Mach. Learn. Res.

[28] Arne Leijon, et al. A new linear MMSE filter for single channel speech enhancement based on Nonnegative Matrix Factorization, 2011, 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).