Paper information - Transcription of the Singing Melody in Polyphonic Music

Transcription of the Singing Melody in Polyphonic Music

This paper proposes a method for the automatic transcription of singing melodies in polyphonic music. The method is based on multiple-F0 estimation followed by acoustic and musicological modeling. The acoustic model consists of separate models for singing notes and for no-melody segments. The musicological model uses key estimation and note bigrams to determine the transition probabilities between notes. Viterbi decoding produces a sequence of notes and rests as a transcription of the singing melody. The performance of the method is evaluated on the RWC popular music database, where it achieved a recall rate of 63% and a precision rate of 46%. A significant improvement was achieved compared to a baseline method from the MIREX05 evaluations.
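
The decoding step described in the abstract can be illustrated with a generic Viterbi search over note and rest states. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the three states (`rest`, `C4`, `D4`), the uniform prior, the toy transition table standing in for note bigram probabilities, and the per-frame likelihoods standing in for acoustic model scores are all hypothetical values chosen for illustration.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state index sequence.

    log_init[s]     : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log likelihood of frame t under state s
    """
    n_states = len(log_init)
    # Best log score of any path ending in each state at the first frame.
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for frame in log_emit[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Pick the predecessor that maximises the score of reaching s.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + frame[s])
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy setup: one rest state and two note states.
states = ["rest", "C4", "D4"]
n = len(states)

# Assumed uniform prior over the starting state.
log_init = [math.log(1.0 / n)] * n

# Assumed transition table (a stand-in for bigram probabilities):
# staying in the same state is likely, switching is not.
log_trans = [[math.log(0.8 if p == s else 0.1) for s in range(n)]
             for p in range(n)]

# Assumed frame likelihoods (a stand-in for acoustic model scores):
# two frames favour rest, two favour C4, two favour D4.
favoured = [0, 0, 1, 1, 2, 2]
log_emit = [[math.log(0.8 if s == f else 0.1) for s in range(n)]
            for f in favoured]

path = viterbi(log_init, log_trans, log_emit)
labels = [states[i] for i in path]
print(labels)
```

In a real system the transition table would depend on the estimated key and the observed note bigrams, and the frame likelihoods would come from the trained note and no-melody models; consecutive frames with the same label would then be merged into note and rest events.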

Anssi Klapuri | Matti Ryynänen
