An open source tool for semi-automatic rhythmic annotation

ABSTRACT

We present a plugin implementation for the multi-platform WaveSurfer sound editor. The added functionalities are the semi-automatic extraction of beats at diverse levels of the metrical hierarchy, as well as uploading to and downloading from a music metadata database. The plugin is built upon existing open source (GPL-licensed) audio processing tools, namely WaveSurfer, BeatRoot and CLAM, with the intent of expanding the scope of that software. It is therefore also provided as GPL code, with the explicit goal that researchers in the audio processing community can freely use and improve it. We provide technical details of the implementation as well as practical use cases. We also motivate the use of rhythmic metadata in Music Information Retrieval scenarios.

1. INTRODUCTION

Rhythm is a fundamental musical feature. Everyone perceives rhythm when listening to music. One can represent rhythm explicitly (i.e. write it down) in many ways, with diverse degrees of detail [1] and by different means, manually or automatically. For instance, a trained listener can transcribe a musical piece into score notation while listening to it repeatedly. The listener can also assign a single value for the basic tempo (in BPM). The level of detail in the representation depends on the purpose of the annotation. That is, different applications require different representations [1].

In any case, it is clear that the task of associating such metadata with musical pieces would be eased by the use of additional software tools. For instance, a simple sound editor plotting waveform and spectrogram would be highly informative to a potential user. A system that automatically computes the desired metadata would obviously also be relevant. However, in this case, subsequent human corrections are a must.
Further, as it is clear that no automatic rhythm description system is perfect, nor are human annotations error-free, interactive systems are highly desirable. In such systems, either the user or an algorithm does a first rough analysis of (part of) the data, then the other uses the results of this analysis to orient its own analysis; the process can be iterated several times.

Very few beat annotation systems exist. In [2], Goto refers to a “beat-position editor.” This is a manual beat annotation tool that provides waveform visualisation and, for accurate annotations, audio feedback in the form of short bursts of noise added at beat times. To our knowledge, the only publicly available (and open source) beat annotation software is BeatRoot [3]. To lower the annotation effort, it provides an automatic beat tracking algorithm. Interactivity resides in the fact that the user’s corrections to the algorithm’s output (the beat times) are fed back as inputs to that same algorithm.

In this paper, we report on a system built upon BeatRoot as well as other open source audio processing tools, namely WaveSurfer [4] and CLAM (both part of the AGNULA GPL distribution of Linux sound software). The intent is to “take the best of several worlds”, that is, to group useful functionalities of those different software packages in a single application, as well as to expand their scope and capabilities.

We focus on a particular kind of rhythmic annotation, the metrical structure, as formalised by Lerdahl and Jackendoff in the Generative Theory of Tonal Music [5]. That is, the metadata we propose to associate with musical signals are particular time points: the beats, at several metrical levels.

Annotations can be stored locally and, when correct, they can easily be uploaded to a remote repository, e.g. a structured musical metadata database such as the MTG database [6], via the SOAP protocol.

2. APPLICATIONS

Knowledge of the beats at different levels of the metrical hierarchy can be useful in many applications.

In Music Information Retrieval research, metadata associated with musical data are very useful. First of all, a database of “ground truth” metadata greatly facilitates the design of automatic algorithms for audio content description. In addition, some recent work in this field includes rhythmic information as input to systems that compute other types of metadata. For instance, beats at one metrical level can be used to determine other metrical levels [7], [8], [9]. They can also be useful as audio segment boundaries for the classification of instruments such as percussion [10], [11], [12]. Other examples are the use of the metrical structure for long-term segmentation and for rhythmic complexity computation. However, reliably obtaining such information from automatic systems is itself a challenge. It is therefore clear that semi-automatic systems would be desirable in this type of research.

Performers’ choices in tempo and expressive timing with respect to position in the metrical structure are very relevant to Musical Performance research. Software tools that ease the annotation of the whole metrical structure, and that generate tempo or timing deviation curves, are clearly useful in this field [13].

Finally, other applications are the synchronisation or sequencing of several musical excerpts, the determination of “looping points” for cut-and-paste operations, the application of tempo-synchronous audio effects (or visual animations), music identification, and rhythmic expressiveness transformations.

DAFX-1
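As an illustration of how beat annotations can feed the applications listed in Section 2, the following minimal Python sketch derives a tempo curve (in BPM) and a timing deviation curve from a list of annotated beat times, and subsamples the beats to obtain a slower metrical level. It is not taken from the plugin, BeatRoot, WaveSurfer or CLAM: the function names, the use of the median inter-beat interval as the reference period, and the fixed grouping factor are all assumptions made for the example.

```python
from statistics import median


def tempo_curve(beat_times):
    """Return (time, bpm) pairs, one per inter-beat interval.

    beat_times: strictly increasing beat positions in seconds.
    Each tempo value is attached to the beat that starts the interval.
    """
    curve = []
    for prev, cur in zip(beat_times, beat_times[1:]):
        interval = cur - prev
        if interval <= 0:
            raise ValueError("beat times must be strictly increasing")
        curve.append((prev, 60.0 / interval))
    return curve


def timing_deviations(beat_times):
    """Return the deviation (seconds) of each beat from an isochronous grid.

    Assumption for this sketch: the grid starts at the first beat and
    uses the median inter-beat interval as its period.
    """
    if len(beat_times) < 2:
        return [0.0 for _ in beat_times]
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    period = median(intervals)
    start = beat_times[0]
    return [t - (start + i * period) for i, t in enumerate(beat_times)]


def slower_level(beat_times, group, offset=0):
    """Keep every `group`-th beat, starting at index `offset`.

    This is a hypothetical helper: it assumes a constant grouping
    (e.g. 2 or 3 beats per unit of the slower metrical level).
    """
    if group < 1:
        raise ValueError("group must be a positive integer")
    return beat_times[offset::group]


if __name__ == "__main__":
    beats = [0.0, 0.5, 1.0, 1.5, 2.0]
    for time, bpm in tempo_curve(beats):
        print(f"{time:.2f} s: {bpm:.1f} BPM")
    print(timing_deviations(beats))
    print(slower_level(beats, 2))
```

In practice the grouping factor and offset of the slower level would themselves be annotated or estimated rather than fixed in advance, which is precisely where a semi-automatic tool is helpful.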