Multipitch Analysis with Harmonic Nonnegative Matrix Approximation

This paper presents a new approach to multipitch analysis based on Harmonic Nonnegative Matrix Approximation, a harmonically constrained and penalized version of the Nonnegative Matrix Approximation (NNMA) method. It also describes a procedure, built on that technique, for retrieving note onsets, offsets and amplitudes. In contrast to previous NNMA approaches, the basis matrix is given a specific initialization: it is zero everywhere except at positions corresponding to the harmonic frequencies of consecutive notes of the equal temperament scale. As a result, the basis contains only harmonically structured vectors, even after the learning process, and the rows of the activity matrix contain peaks corresponding to note onset times and amplitudes. In addition, penalties enforcing mutual decorrelation and sparseness of rows are placed on the activity matrix. The proposed method uncovers the underlying musical structure better than previous NNMA approaches and makes note detection very straightforward.
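The harmonic initialization described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a linear-frequency magnitude spectrogram, A4 = 440 Hz tuning, and plain Euclidean multiplicative updates in the style of Lee and Seung, and it omits the sparseness and decorrelation penalties entirely. The names `harmonic_basis` and `nmf_multiplicative` and all parameter values are hypothetical. Under these assumptions, the multiplicative form of the update is what keeps zero entries of the basis at zero, so the harmonic structure survives learning.

```python
import numpy as np

def harmonic_basis(n_bins, sample_rate, midi_notes, n_harmonics=8):
    """Zero matrix with ones at the bins nearest each note's harmonics.

    Assumes bins span 0..Nyquist linearly (an illustrative choice).
    """
    nyquist = sample_rate / 2.0
    W = np.zeros((n_bins, len(midi_notes)))
    for col, midi in enumerate(midi_notes):
        # Equal temperament fundamental, assuming A4 (MIDI 69) = 440 Hz.
        f0 = 440.0 * 2.0 ** ((midi - 69) / 12.0)
        for h in range(1, n_harmonics + 1):
            freq = h * f0
            if freq >= nyquist:
                break
            bin_idx = int(round(freq / nyquist * (n_bins - 1)))
            W[bin_idx, col] = 1.0
    return W

def nmf_multiplicative(V, W_init, n_iter=50, eps=1e-9, seed=0):
    """Euclidean multiplicative updates; no penalties (unlike the paper)."""
    rng = np.random.default_rng(seed)
    W = W_init.copy()
    H = rng.random((W.shape[1], V.shape[1]))  # activity matrix, random start
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Entries of W that start at zero are multiplied, so they stay zero.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage on synthetic data: two octaves of notes, random nonnegative "spectrogram".
W0 = harmonic_basis(n_bins=513, sample_rate=16000, midi_notes=range(48, 72))
V = np.random.default_rng(1).random((513, 20))
W, H = nmf_multiplicative(V, W0)
```

Each row of `H` then plays the role of one note's activity over time; the onset and amplitude retrieval step mentioned in the abstract is not sketched here.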
