A comparative analysis of time-frequency decompositions in polyphonic pitch estimation

In monaural polyphonic music, most multiple fundamental frequency estimation systems rely on time-frequency information extracted from the time-domain signal, computed mainly with fixed-resolution or variable-resolution time-frequency decompositions. This information is crucial to the estimation process because it must clearly represent everything needed to find the set of active pitches. In this paper, we present a preliminary study of two decompositions, the Constant Q Transform and the Short Time Fourier Transform, each integrated into the same multiple fundamental frequency estimation system. The aim is to determine which decomposition is more suitable for polyphonic musical signal analysis and how each one influences the accuracy of the polyphonic estimation, evaluated over low, middle, and high frequency ranges.
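The contrast between the two decompositions can be illustrated with a minimal sketch that compares bin spacing and window length. The Short Time Fourier Transform places bins linearly with one fixed window, whereas the Constant Q Transform places bins geometrically and uses a window length that shrinks as frequency rises. All parameter values below (sample rate, FFT size, lowest frequency, bins per octave, number of bins) and all function names are illustrative assumptions, not settings or code taken from the study.

```python
import math


def stft_bin_frequencies(sample_rate, n_fft):
    """Centre frequencies (Hz) of the non-negative STFT bins: linear spacing."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


def cqt_bin_frequencies(f_min, bins_per_octave, n_bins):
    """Centre frequencies (Hz) of CQT bins: geometric spacing from f_min."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def cqt_quality_factor(bins_per_octave):
    """Q = centre frequency / bandwidth, identical for every CQT bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def cqt_window_lengths(sample_rate, f_min, bins_per_octave, n_bins):
    """Per-bin analysis window length in samples: N_k = ceil(Q * fs / f_k)."""
    q = cqt_quality_factor(bins_per_octave)
    freqs = cqt_bin_frequencies(f_min, bins_per_octave, n_bins)
    return [math.ceil(q * sample_rate / f) for f in freqs]


# Illustrative values only (assumed, not from the paper).
SAMPLE_RATE = 44100      # Hz
N_FFT = 4096             # fixed STFT window / FFT size in samples
F_MIN = 27.5             # Hz, lowest piano note (A0)
BINS_PER_OCTAVE = 12     # one bin per semitone
N_BINS = 88              # span of a piano keyboard

stft_freqs = stft_bin_frequencies(SAMPLE_RATE, N_FFT)
cqt_freqs = cqt_bin_frequencies(F_MIN, BINS_PER_OCTAVE, N_BINS)
win_lengths = cqt_window_lengths(SAMPLE_RATE, F_MIN, BINS_PER_OCTAVE, N_BINS)

# STFT spacing is the same everywhere; semitone spacing grows with pitch.
stft_spacing = SAMPLE_RATE / N_FFT
lowest_semitone_gap = cqt_freqs[1] - cqt_freqs[0]
highest_semitone_gap = cqt_freqs[-1] - cqt_freqs[-2]

print(f"STFT bin spacing:        {stft_spacing:.2f} Hz (constant)")
print(f"Lowest semitone gap:     {lowest_semitone_gap:.2f} Hz")
print(f"Highest semitone gap:    {highest_semitone_gap:.2f} Hz")
print(f"CQT window, lowest bin:  {win_lengths[0]} samples")
print(f"CQT window, highest bin: {win_lengths[-1]} samples")
```

Under these assumed settings, the gap between the two lowest semitones is smaller than one STFT bin, so the fixed-resolution decomposition cannot separate them, while the gap between the two highest semitones spans many bins. The CQT resolves low notes at the cost of a much longer window there, which trades time resolution for frequency resolution. This is one reason a separate low, middle, and high frequency evaluation is informative.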
