MULTIPLE-F0 TRACKING BASED ON A HIGH-ORDER HMM MODEL

This paper addresses multiple-F0 tracking and the estimation of the number of harmonic source streams in music signals. A source stream is understood as being generated by a note played on a musical instrument. A note is described by a hidden Markov model (HMM) with two states: the attack state and the sustain state. It is proposed to first track F0 candidates using a high-order hidden Markov model, based on a forward-backward dynamic programming scheme. The propagated weights are calculated in the forward tracking stage, and the most likely trajectories are then tracked iteratively in the backward tracking stage. The underlying source streams are then estimated by iteratively pruning the candidate trajectories in a maximum likelihood manner. The proposed system is evaluated on a specially constructed polyphonic music database. Compared with frame-based estimation systems, the tracking mechanism significantly improves the accuracy rate.
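
The forward-backward scheme described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses a first-order transition in place of the high-order HMM, omits the attack/sustain note model and the maximum likelihood pruning, and every name, score and parameter in it (`track_trajectories`, `max_jump`, `n_tracks`, the semitone penalty) is a hypothetical choice made for illustration. Each frame holds a list of (F0 in Hz, salience score) candidates; the forward pass propagates weights, and the backward pass traces the best trajectory, removes it, and repeats.

```python
import math


def track_trajectories(frames, max_jump=0.5, n_tracks=2):
    """Extract up to n_tracks F0 trajectories from per-frame candidates.

    frames: list of lists of (f0_hz, score) tuples, one list per frame.
    max_jump: largest allowed frame-to-frame jump in semitones (assumed).
    Returns a list of trajectories, each a list of F0 values in Hz.
    """
    frames = [list(f) for f in frames]  # work on a copy
    tracks = []
    for _ in range(n_tracks):
        if not frames or any(len(f) == 0 for f in frames):
            break
        # Forward stage: propagate the best accumulated weight to each candidate.
        weights = [[s for _, s in frames[0]]]
        back = [[None] * len(frames[0])]
        for t in range(1, len(frames)):
            w_t, b_t = [], []
            for f0, score in frames[t]:
                best_w, best_i = -math.inf, None
                for i, (prev_f0, _) in enumerate(frames[t - 1]):
                    jump = abs(12.0 * math.log2(f0 / prev_f0))
                    if jump > max_jump:
                        continue  # transition not allowed
                    w = weights[t - 1][i] - jump  # penalise pitch jumps
                    if w > best_w:
                        best_w, best_i = w, i
                w_t.append(best_w + score if best_i is not None else -math.inf)
                b_t.append(best_i)
            weights.append(w_t)
            back.append(b_t)
        # Backward stage: trace the most likely trajectory from the last frame.
        last = max(range(len(weights[-1])), key=lambda i: weights[-1][i])
        if weights[-1][last] == -math.inf:
            break  # no connected trajectory remains
        path = [last]
        for t in range(len(frames) - 1, 0, -1):
            path.append(back[t][path[-1]])
        path.reverse()
        tracks.append([frames[t][i][0] for t, i in enumerate(path)])
        # Remove the used candidates so the next pass finds another trajectory.
        for t, i in enumerate(path):
            frames[t].pop(i)
    return tracks
```

Called on three frames that each contain one candidate near 220 Hz and one near 330 Hz, this sketch returns two separate trajectories, because the seven-semitone gap between the two sources exceeds `max_jump` and so no trajectory can switch between them.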
