Frame-Level Audio Segmentation for Abridged Musical Works

Large-scale musical works such as operas may last several hours and typically involve a huge number of musicians. For such compositions, one often finds different arrangements and abridged versions (often lasting less than an hour), which can also be performed by smaller ensembles. Abridged versions still convey the flavor of the musical work containing the most important excerpts and melodies. In this paper, we consider the task of automatically segmenting an audio recording of a given version into semantically meaningful parts. Following previous work, the general strategy is to transfer a reference segmentation of the original complete work to the given version. Our main contribution is to show how this can be accomplished when dealing with strongly abridged versions. To this end, opposed to previously suggested segment-level matching procedures, we adapt a frame-level matching approach for transferring the reference segment information to the unknown version. Considering the opera “Der Freischutz” as an example scenario, we discuss how to balance out flexibility and robustness properties of our proposed framelevel segmentation procedure.