Automatic Music Transcription using row weighted decompositions

Automatic Music Transcription (AMT) seeks to understand a musical piece in terms of note activities. Matrix decomposition methods are often used for AMT, seeking to decompose a spectrogram over a dictionary matrix of note-specific template vectors. The performance of these methods can suffer due to the large harmonic overlap found in tonal musical spectra. We propose a row weighting scheme that transforms each spectrogram frame and the dictionary, with the weighting determined by the effective correlations in the decomposition. Experiments show improved AMT performance.

[1]  Michael Elad,et al.  On the Uniqueness of Nonnegative Sparse Solutions to Underdetermined Systems of Equations , 2008, IEEE Transactions on Information Theory.

[2]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[3]  Pierre Vandergheynst,et al.  Dictionary Preconditioning for Greedy Algorithms , 2008, IEEE Transactions on Signal Processing.

[4]  P. Smaragdis,et al.  Non-negative matrix factorization for polyphonic music transcription , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).

[5]  Charles L. Lawson,et al.  Solving least squares problems , 1976, Classics in applied mathematics.

[6]  Emmanuel Vincent,et al.  Adaptive Harmonic Spectral Decomposition for Multiple Pitch Estimation , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[7]  Shigeki Sagayama,et al.  Extending Nonnegative Matrix Factorization—A discussion in the context of multiple frequency estimation of musical signals , 2009, 2009 17th European Signal Processing Conference.

[8]  Abolfazl Mehbodniya,et al.  A re-weighted algorithm for designing data dependent sensing dictionary , 2011 .

[9]  Robert H. Halstead,et al.  Matrix Computations , 2011, Encyclopedia of Parallel Computing.

[10]  Sabine Van Huffel,et al.  Overview of total least-squares methods , 2007, Signal Process..

[11]  Andrzej Cichocki,et al.  Csiszár's Divergences for Non-negative Matrix Factorization: Family of New Algorithms , 2006, ICA.

[12]  Y. C. Pati,et al.  Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition , 1993, Proceedings of 27th Asilomar Conference on Signals, Systems and Computers.

[13]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[14]  Nancy Bertin,et al.  Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis , 2009, Neural Computation.

[15]  Tuomas Virtanen,et al.  Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[16]  Joel A. Tropp,et al.  Greed is good: algorithmic results for sparse approximation , 2004, IEEE Transactions on Information Theory.

[17]  R. Bro,et al.  A fast non‐negativity‐constrained least squares algorithm , 1997 .

[18]  Emmanuel Vincent,et al.  Harmonic and inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch transcription , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[19]  Roland Badeau,et al.  Multipitch Estimation of Piano Sounds Using a New Probabilistic Spectral Smoothness Principle , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[20]  Emmanuel J. Candès,et al.  Decoding by linear programming , 2005, IEEE Transactions on Information Theory.

[21]  Mark D. Plumbley,et al.  INK-SVD: Learning incoherent dictionaries for sparse representations , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).