Nonnegative dictionary learning in the exponential noise model for adaptive music signal representation

In this paper we describe a maximum likelihood approach to dictionary learning under the multiplicative exponential noise model. This model is prevalent in audio signal processing, where it underlies a generative composite model of the power spectrogram. Maximum joint likelihood estimation of the dictionary and the expansion coefficients leads to a nonnegative matrix factorization (NMF) problem with the Itakura-Saito divergence. The optimality of this approach is questionable, however, because the number of parameters (which include the expansion coefficients) grows with the number of observations. We describe a variational procedure for optimizing the marginal likelihood, i.e., the likelihood of the dictionary with the activation coefficients integrated out under a given prior. We compare maximum joint likelihood estimation (i.e., standard Itakura-Saito NMF) with maximum marginal likelihood estimation (MMLE) on real and synthetic datasets. The MMLE approach is shown to embed automatic model order selection, akin to automatic relevance determination.
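To make the joint-likelihood baseline concrete, the following is a minimal sketch of standard Itakura-Saito NMF with multiplicative updates, the estimator the paper compares MMLE against. It is a generic illustration, not the paper's own implementation; the function names (`is_divergence`, `is_nmf`) and the random initialization scheme are illustrative choices, and the variational MMLE procedure itself is not shown.

```python
import numpy as np

def is_divergence(V, Vhat):
    """Itakura-Saito divergence d_IS(V | Vhat), summed over all entries."""
    R = V / Vhat
    return np.sum(R - np.log(R) - 1.0)

def is_nmf(V, K, n_iter=200, seed=0):
    """Approximate a nonnegative matrix V (F x N) as W @ H with W (F x K),
    H (K x N) nonnegative, using the standard multiplicative updates for
    the Itakura-Saito divergence. Maximizing the joint likelihood under
    multiplicative exponential noise is equivalent to minimizing this
    divergence, with H playing the role of the activation coefficients."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + 1e-3
    H = rng.random((K, N)) + 1e-3
    for _ in range(n_iter):
        Vhat = W @ H
        H *= (W.T @ (V / Vhat**2)) / (W.T @ (1.0 / Vhat))
        Vhat = W @ H
        W *= ((V / Vhat**2) @ H.T) / ((1.0 / Vhat) @ H.T)
    return W, H
```

Note that K, the number of dictionary columns, must be chosen by hand here; the point of the MMLE approach is that integrating out H prunes unneeded columns automatically.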
