Synthesized Polyphonic Music Database with Verifiable Ground Truth for Multiple F0 Estimation

Studying and evaluating a multiple F0 estimation algorithm requires a polyphonic database with verifiable ground truth. Real recordings with manual annotations as ground truth are often used for evaluation. However, manual annotation involves ambiguities, which are usually resolved by subjective judgement. To obtain verifiable ground truth, we therefore propose a systematic method for creating a polyphonic music database. Multiple monophonic tracks are rendered from a given MIDI file, with the rendered samples separated across tracks so that they do not overlap, which facilitates automatic annotation. F0s can then be reliably extracted as ground truth and are stored in the Sound Description Interchange Format (SDIF).
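The track-separation idea can be illustrated with a minimal sketch. It is not the authors' implementation: the `Note` record, the `gap` parameter (standing in for the release tail of a rendered sample), and the greedy assignment strategy are all assumptions made for illustration. The sketch distributes note events over several tracks so that no two notes in the same track overlap, and converts MIDI note numbers to nominal F0 values in Hz under equal temperament with A4 = 440 Hz.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    # Hypothetical note record; field names are illustrative assumptions.
    onset: float   # start time in seconds
    offset: float  # end time in seconds (offset >= onset)
    pitch: int     # MIDI note number


def midi_to_hz(pitch, a4=440.0):
    """Nominal F0 of a MIDI note number in equal temperament."""
    return a4 * 2.0 ** ((pitch - 69) / 12.0)


def split_non_overlapping(notes, gap=0.0):
    """Greedily assign notes to tracks so that notes within a track never overlap.

    `gap` is an assumed minimum silence (in seconds) required between the end
    of one note and the start of the next note in the same track.
    """
    tracks = []
    for note in sorted(notes, key=lambda n: (n.onset, n.offset)):
        for track in tracks:
            # Reuse the first track whose last note has already ended.
            if track[-1].offset + gap <= note.onset:
                track.append(note)
                break
        else:
            # No existing track is free, so open a new one.
            tracks.append([note])
    return tracks
```

Each resulting track could then be rendered on its own, so that a single-F0 estimator sees only one sounding note at a time. A real pipeline would estimate F0 from the rendered audio rather than rely on the nominal value returned by `midi_to_hz`, since the played pitch can deviate from the notated one.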
