Paper information - Coupled tensor factorization models for polyphonic music transcription

Coupled tensor factorization models for polyphonic music transcription

Generalized Coupled Tensor Factorization (GCTF) is a recently proposed algorithmic framework for simultaneously estimating tensor factorization models in which several tensors can share a set of latent factors. This paper presents two models in this framework for transcribing polyphonic piano pieces. The first model is based on Non-negative Matrix Factorization (NMF), where the coupling provides spectral information to the model. The second model extends the first by incorporating temporal and harmonic information, taking a rough, incomplete transcription of the piece as input. Incorporating harmonic knowledge improves transcription quality: the experimental results show an F-measure improvement of around 23 % on real piano data.
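The idea of coupling in an NMF-based model can be illustrated with a minimal sketch. It is not the authors' GCTF implementation. It assumes that the coupling takes the form of a spectral dictionary `D` shared between the spectrogram of the piece (`X1`) and a spectrogram of isolated notes (`X2`), and that both are fitted under the generalized Kullback-Leibler divergence with standard multiplicative updates. The names `X1`, `X2`, `D`, `E`, `F` and `coupled_kl_nmf` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def kl_div(X, Xhat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between X and its model Xhat."""
    return float(np.sum(X * np.log((X + eps) / (Xhat + eps)) - X + Xhat))


def coupled_kl_nmf(X1, X2, rank, n_iter=200, seed=0, eps=1e-12):
    """Fit X1 ~ D @ E and X2 ~ D @ F with a shared non-negative dictionary D.

    X1: (n_freq, n_frames_1) magnitude spectrogram of the piece (assumed).
    X2: (n_freq, n_frames_2) spectrogram of isolated notes (assumed).
    Returns D, E, F and the summed divergence after each iteration.
    """
    rng = np.random.default_rng(seed)
    n_freq = X1.shape[0]
    # Strictly positive random initialization keeps the updates well defined.
    D = rng.random((n_freq, rank)) + 0.1
    E = rng.random((rank, X1.shape[1])) + 0.1
    F = rng.random((rank, X2.shape[1])) + 0.1

    history = []
    for _ in range(n_iter):
        # Update the excitations of each observed matrix with D held fixed.
        col_sums = D.sum(axis=0)[:, None] + eps
        E *= (D.T @ (X1 / (D @ E + eps))) / col_sums
        F *= (D.T @ (X2 / (D @ F + eps))) / col_sums

        # Update the shared dictionary: both observations contribute,
        # which is where the coupling enters.
        R1 = X1 / (D @ E + eps)
        R2 = X2 / (D @ F + eps)
        row_sums = (E.sum(axis=1) + F.sum(axis=1))[None, :] + eps
        D *= (R1 @ E.T + R2 @ F.T) / row_sums

        history.append(kl_div(X1, D @ E) + kl_div(X2, D @ F))
    return D, E, F, history


if __name__ == "__main__":
    # Synthetic non-negative data that share a dictionary by construction.
    rng = np.random.default_rng(1)
    D_true = rng.random((12, 3))
    X1 = D_true @ rng.random((3, 20))
    X2 = D_true @ rng.random((3, 8))
    D, E, F, history = coupled_kl_nmf(X1, X2, rank=3, n_iter=100)
    print("first and last summed divergence:", history[0], history[-1])
```

In this simplified setting the rows of `E` play the role of note activations over time, and a transcription would be obtained by thresholding them. The second model described above, which additionally uses a rough transcription as input, is not covered by this sketch.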

Ali Taylan Cemgil | Umut Simsekli | Yusuf Kenan Yilmaz
