Overlapped event-note separation based on partials amplitude and phase estimation for polyphonic music transcription

We propose a discriminative model for polyphonic music transcription that addresses the well-known overlapped-partial problem by taking into account the instrument envelope pattern of each note. We detail the process used to obtain music scene-adaptive envelope patterns for each note. First, spectral features are extracted individually for each note. Then, support vector machines (SVMs) are trained on the note energy. A scheme of one-versus-all (OVA) SVM classifiers provides an approximation of the note instances active in each frame. Finally, the amplitudes and phases of the partials are estimated by considering the envelope patterns of the different notes, distributing the energy according to the estimated envelope pattern of each note. Temporal information is also incorporated by means of hidden Markov models (HMMs). The approach has been tested on synthesized and real music recordings, with promising results.
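To make the energy-distribution step more concrete, the following minimal sketch shows one way the energy observed at an overlapped partial could be shared among the active notes using their envelope patterns. It is an illustration under stated assumptions, not the method of the paper: the proportional-split rule, the function name `split_overlapped_partial`, and all envelope values are hypothetical, and phase estimation is not covered.

```python
def split_overlapped_partial(observed_energy, expected):
    """Share the energy observed at an overlapped partial among notes.

    `expected` maps a note name to the relative energy that the note's
    envelope pattern predicts for its harmonic falling on this partial.
    Assumption: a simple proportional split; the abstract does not
    specify the exact distribution rule.
    """
    total = sum(expected.values())
    if total <= 0:
        # No pattern predicts energy here: fall back to an equal split.
        share = observed_energy / len(expected)
        return {note: share for note in expected}
    return {
        note: observed_energy * value / total
        for note, value in expected.items()
    }


# Hypothetical scene: C4 (about 261.6 Hz) and G4 (about 392.0 Hz) sound
# together. Their partials collide near 784 Hz, which is the 3rd harmonic
# of C4 and the 2nd harmonic of G4.
envelopes = {
    "C4": [1.0, 0.5, 0.25, 0.125],  # made-up relative energy, harmonics 1..4
    "G4": [1.0, 0.75, 0.5, 0.25],
}
expected = {
    "C4": envelopes["C4"][2],  # 3rd harmonic of C4
    "G4": envelopes["G4"][1],  # 2nd harmonic of G4
}
shares = split_overlapped_partial(8.0, expected)
print(shares)
```

In this toy case the note whose envelope pattern predicts more energy at the shared partial (G4) receives the larger share, and the shares always sum to the observed energy.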
