A discriminative approach to polyphonic piano note transcription using supervised non-negative matrix factorization

We introduce a novel method for the transcription of polyphonic piano music by discriminative training of support vector machines (SVMs). As features, we use pitch activations computed by supervised non-negative matrix factorization from low-level spectral features. Different approaches to low-level feature extraction, NMF dictionary learning and activation feature extraction are analyzed in a large-scale evaluation on eight hours of piano music including synthesized and real recordings. We conclude that the proposed method delivers state-of-the-art results and clearly outperforms SVMs using simple spectral features.

[1]  Emmanuel Vincent,et al.  Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[2]  Tuomas Virtanen,et al.  Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[3]  Daniel P. W. Ellis,et al.  A Discriminative Model for Polyphonic Piano Transcription , 2007, EURASIP J. Adv. Signal Process..

[4]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[5]  Juhan Nam,et al.  A Classification-Based Polyphonic Piano Transcription Approach Using Learned Feature Representations , 2011, ISMIR.

[6]  Jouni Paulus,et al.  Drum transcription with non-negative spectrogram factorisation , 2005, 2005 13th European Signal Processing Conference.

[7]  Mark D. Plumbley,et al.  Unsupervised analysis of polyphonic music by sparse coding , 2006, IEEE Transactions on Neural Networks.

[8]  Markus Schedl,et al.  Polyphonic piano note transcription with recurrent neural networks , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[9]  Matija Marolt,et al.  A connectionist approach to automatic transcription of polyphonic piano music , 2004, IEEE Transactions on Multimedia.

[10]  P. Smaragdis,et al.  Non-negative matrix factorization for polyphonic music transcription , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).

[11]  Mark D. Plumbley,et al.  Polyphonic music transcription by non-negative sparse coding of power spectra , 2004 .

[12]  E. Vincent,et al.  TWO NONNEGATIVE MATRIX FACTORIZATION METHODS FOR POLYPHONIC PITCH TRANSCRIPTION , 2007 .

[13]  Simon Dixon,et al.  On the Computer Recognition of Solo Piano Music , 2000 .

[14]  Björn Schuller,et al.  Automatic Transcription of Recorded Music , 2012 .

[15]  Roland Badeau,et al.  Multipitch Estimation of Piano Sounds Using a New Probabilistic Spectral Smoothness Principle , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[16]  Giovanni Saggio,et al.  Automatic music transcription based on non-negative matrix factorization , 2010 .

[17]  Rainer Lienhart,et al.  Note onset detection for the transcription of polyphonic piano music , 2009, 2009 IEEE International Conference on Multimedia and Expo.

[18]  Anssi Klapuri,et al.  Automatic Music Transcription: Breaking the Glass Ceiling , 2012, ISMIR.

[19]  Christian Schörkhuber CONSTANT-Q TRANSFORM TOOLBOX FOR MUSIC PROCESSING , 2010 .

[20]  Meinard Müller,et al.  Using score-informed constraints for NMF-based source separation , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[21]  Rahim Saeidi,et al.  Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition , 2012, INTERSPEECH.

[22]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[23]  Thomas G. Dietterich Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.

[24]  Björn W. Schuller,et al.  Real-Time Speech Separation by Semi-supervised Nonnegative Matrix Factorization , 2012, LVA/ICA.

[25]  Seokjin Lee,et al.  Automatic music transcription using Non-negative matrix factorization , 2010 .

[26]  Mark D. Plumbley,et al.  Polyphonic transcription by non-negative sparse coding of power spectra , 2004, ISMIR.