Multiple-F0 estimation of piano sounds exploiting spectral structure and temporal evolution

This paper proposes a system for multiple fundamental frequency (multiple-F0) estimation of piano sounds, using pitch candidate selection rules that exploit spectral structure and temporal evolution. The Resonator Time-Frequency Image of the input signal is used as the time-frequency representation; a noise suppression model is applied and spectral whitening is performed. A spectral flux-based onset detector selects the steady-state region of the produced sound. In the multiple-F0 estimation stage, tuning and inharmonicity parameters are extracted and a pitch salience function is proposed. Pitch presence tests use the spectral structure of the pitch candidates to suppress errors occurring at multiples and sub-multiples of the true pitches. A novel feature for estimating harmonically related pitches is proposed, based on the common amplitude modulation assumption. Experiments are performed on the MAPS database using 8784 piano samples of classical, jazz, and random chords with polyphony levels between 1 and 6. The proposed system is computationally inexpensive and can perform multiple-F0 estimation in real time. Experimental results indicate that it outperforms state-of-the-art approaches for this task by a statistically significant margin.
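To illustrate why an inharmonicity parameter matters when searching for the partials of a piano pitch candidate, the following minimal sketch uses the textbook stiff-string model, in which partial h of a note with nominal fundamental f0 lies at h · f0 · sqrt(1 + B · h²). This is an assumption made for illustration: the abstract does not state which inharmonicity model the system uses, and the function name `partial_frequencies` and the value B = 4e-4 are hypothetical.

```python
import math


def partial_frequencies(f0, inharmonicity, num_partials):
    """Return the first num_partials partial frequencies (Hz) of a string.

    Uses the textbook stiff-string model f_h = h * f0 * sqrt(1 + B * h^2).
    This is an illustrative assumption, not necessarily the paper's model.
    """
    return [
        h * f0 * math.sqrt(1.0 + inharmonicity * h * h)
        for h in range(1, num_partials + 1)
    ]


# Compare an ideal string (B = 0) with a stiff one (hypothetical B = 4e-4).
ideal = partial_frequencies(220.0, 0.0, 5)
stiff = partial_frequencies(220.0, 4e-4, 5)

for h, (f_ideal, f_stiff) in enumerate(zip(ideal, stiff), start=1):
    print(f"partial {h}: ideal {f_ideal:8.2f} Hz, stiff {f_stiff:8.2f} Hz")
```

Under this model the upper partials drift progressively sharp of the exact integer multiples, so a salience function that looks only at h · f0 would gradually miss the energy of higher partials.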
