Paper information - Does inharmonicity improve an NMF-based piano transcription model?

This paper investigates how precise a model should be for a robust model-based NMF analysis of piano recordings. While inharmonicity is an essential feature of piano tones from a perceptual point of view, its explicit inclusion in sound models is not straightforward and may even degrade the quality of the analysis. Here, we assess the quality of the analysis with a transcription task and compare three different models for the spectra of the dictionary: one strictly harmonic, one following the theoretical inharmonicity law, and one with relaxed inharmonicity constraints. Experimental results show that both inharmonic models can indeed significantly improve the results, but only when a good initialization is provided.
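To make the difference between the first two dictionary models concrete, the sketch below computes partial frequencies for a strictly harmonic tone and for a tone following the standard stiff-string inharmonicity law, f_n = n * f0 * sqrt(1 + B * n^2). The abstract does not state this formula; it is the textbook form of the law and is assumed here. The function name `partial_frequencies` and the value of the inharmonicity coefficient `B` are illustrative choices, not taken from the paper, and the relaxed-constraint model is not sketched.

```python
import math


def partial_frequencies(f0, n_partials, B=0.0):
    """Return the frequencies (Hz) of partials 1..n_partials.

    B = 0 gives a strictly harmonic series (n * f0).
    B > 0 applies the stiff-string law f_n = n * f0 * sqrt(1 + B * n^2),
    which is assumed here as the "theoretical inharmonicity law".
    """
    return [n * f0 * math.sqrt(1.0 + B * n * n)
            for n in range(1, n_partials + 1)]


# Strictly harmonic model: partials at exact integer multiples of f0.
harmonic = partial_frequencies(440.0, 5)

# Inharmonic model: B = 1e-3 is an illustrative value, not a measured one.
inharmonic = partial_frequencies(440.0, 5, B=1e-3)

# The deviation from the harmonic series grows with the partial rank.
deviations = [fi - fh for fi, fh in zip(inharmonic, harmonic)]
```

With any positive `B`, every partial lies above its harmonic counterpart and the gap widens with rank, which is why a strictly harmonic template can miss the upper partials of a piano tone.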

Laurent Daudet | Bertrand David | Antoine Falaize | François Rigaud
