Paper information - Complex NMF with the generalized Kullback-Leibler divergence

Complex NMF with the generalized Kullback-Leibler divergence

We previously introduced a phase-aware variant of the non-negative matrix factorization (NMF) approach for audio source separation, which we call “Complex NMF” (CNMF). This approach makes it possible to realize NMF-like signal decompositions in the complex time-frequency domain. One limitation of the CNMF framework is that its divergence measure is restricted to the Euclidean distance. Previous studies have shown that, for source separation tasks with NMF, the generalized Kullback-Leibler (KL) divergence tends to yield higher accuracy than other divergence measures. This led us to expect that CNMF could achieve higher source separation accuracy if an algorithm for a KL divergence counterpart of CNMF could be derived. In this paper, we start by defining the notion of the “dual” form of the CNMF formulation, derived from the original Euclidean CNMF, and show that a KL divergence counterpart of CNMF can be developed based on this dual formulation. We call this counterpart “KL-CNMF.” We further derive a convergence-guaranteed iterative algorithm for KL-CNMF based on a majorization-minimization scheme. Source separation experiments showed that the proposed KL-CNMF yielded higher accuracy than both the Euclidean CNMF and NMF with various divergence measures.
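For readers unfamiliar with the generalized KL divergence mentioned above, the following is a minimal sketch of ordinary (real-valued, magnitude-domain) NMF with the generalized KL divergence, using the well-known multiplicative updates that can be derived by majorization-minimization. It is not the KL-CNMF algorithm of the paper, which operates on complex spectrograms and is not reproduced here. The function names `generalized_kl` and `kl_nmf`, the small constant `eps`, and the random test matrix are assumptions made for illustration only.

```python
import numpy as np


def generalized_kl(Y, X, eps=1e-12):
    """Generalized KL divergence D(Y || X) = sum(Y * log(Y / X) - Y + X)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    # eps guards against log(0) and division by zero (illustrative choice).
    return float(np.sum(Y * np.log((Y + eps) / (X + eps)) - Y + X))


def kl_nmf(Y, rank, n_iter=100, seed=0, eps=1e-12):
    """Plain NMF, Y ~= W @ H, fitted with multiplicative updates for the
    generalized KL divergence. Y must be a non-negative 2-D array."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    W = rng.random((n_freq, rank)) + eps   # basis spectra (non-negative)
    H = rng.random((rank, n_frames)) + eps  # activations (non-negative)
    losses = []
    for _ in range(n_iter):
        X = W @ H + eps
        # W[f, k] *= sum_t (Y/X)[f, t] * H[k, t] / sum_t H[k, t]
        W *= ((Y / X) @ H.T) / (H.sum(axis=1) + eps)
        X = W @ H + eps
        # H[k, t] *= sum_f W[f, k] * (Y/X)[f, t] / sum_f W[f, k]
        H *= (W.T @ (Y / X)) / (W.sum(axis=0)[:, None] + eps)
        losses.append(generalized_kl(Y, W @ H))
    return W, H, losses


# Toy usage on a random non-negative matrix standing in for a magnitude spectrogram.
Y_demo = np.random.default_rng(1).random((20, 30)) + 0.1
W_demo, H_demo, losses_demo = kl_nmf(Y_demo, rank=4, n_iter=50)
print("first / last loss:", losses_demo[0], losses_demo[-1])
```

Because each update multiplies a non-negative factor by a non-negative ratio, `W` and `H` stay non-negative without any explicit projection, which is the property that makes this family of updates convenient for NMF.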

Hirokazu Kameoka | Masahiro Yukawa | Hideaki Kagami
