Transcription of the Singing Melody in Polyphonic Music

This paper proposes a method for the automatic transcription of singing melodies in polyphonic music. The method is based on multiple-F0 estimation followed by acoustic and musicological modeling. The acoustic model consists of separate models for singing notes and for no-melody segments. The musicological model uses key estimation and note bigrams to determine the transition probabilities between notes. Viterbi decoding then produces a sequence of notes and rests as the transcription of the singing melody. The method is evaluated on the RWC popular music database, where it achieves a recall rate of 63% and a precision rate of 46%. This is a significant improvement over a baseline method from the MIREX05 evaluations.
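The decoding step can be illustrated with a minimal sketch of generic Viterbi decoding. It is not the paper's implementation: the three states, the per-frame likelihoods (standing in for the acoustic model), and the transition table (standing in for the key- and bigram-based musicological model) are all made-up toy values chosen only to show how transition probabilities can override a locally ambiguous frame.

```python
import math

def viterbi(log_obs, log_trans, log_init):
    """Return the most likely state index sequence.

    log_obs[t][s]   : log-likelihood of frame t under state s (toy acoustic model)
    log_trans[p][s] : log-probability of moving from state p to state s
                      (toy stand-in for the musicological model)
    log_init[s]     : log-probability of starting in state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_obs)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor for state s at frame t.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_obs[t][s])
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def to_log(table):
    """Convert a table (list of lists) of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in table]

# Hypothetical toy setup: one rest state and two note states.
STATES = ["rest", "C4", "D4"]
INIT = [0.6, 0.2, 0.2]
TRANS = [  # rows: from-state, columns: to-state; self-transitions favored
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
OBS = [  # per-frame likelihoods for (rest, C4, D4); invented values
    [0.70, 0.20, 0.10],
    [0.05, 0.90, 0.05],
    [0.30, 0.30, 0.40],  # ambiguous frame that slightly favors D4
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
    [0.10, 0.20, 0.70],
]

path = viterbi(to_log(OBS), to_log(TRANS), [math.log(p) for p in INIT])
decoded = [STATES[i] for i in path]
print(decoded)
```

In this toy run the third frame is decoded as C4 even though its frame-level likelihood slightly favors D4, because the transition table penalizes a brief jump away from the surrounding C4 frames.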
