Towards an Efficient Algorithm for Automatic Score-to-Audio Synchronization

In the last few years, several algorithms have been proposed for the automatic alignment of audio and score data corresponding to the same piece of music. Major drawbacks of these approaches are their long running times and large memory requirements. In this paper we present an algorithm that solves the synchronization problem accurately and efficiently for complex, polyphonic piano music. In a first step, we extract from the audio data stream a set of highly expressive features that encode note onset candidates separately for each pitch. This keeps the computation efficient, since a small number of such features suffices to solve the synchronization task. Based on a suitable matching model, the best match between the score and the feature parameters is then computed by dynamic programming (DP). To further reduce the computational cost of the synchronization process, we introduce the concept of anchor matches, that is, matches that can be established easily. The DP-based technique is then applied only locally, between adjacent anchor matches. Evaluation results have been obtained on complex polyphonic piano pieces, including Chopin’s Etudes Op. 10.
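
To make the two-stage idea concrete, the following is a minimal sketch, not the matching model of the paper, which the abstract does not specify. It assumes that both the score and the audio onset features have already been reduced to time-ordered lists of pitch sets (MIDI note numbers), that similarity is measured by the Jaccard overlap of two pitch sets, and that an anchor match is a pitch set occurring exactly once in the score and exactly once in the audio. The names similarity, dp_match, find_anchors and synchronize are hypothetical and chosen for illustration only.

```python
from collections import Counter


def similarity(score_pitches, audio_pitches):
    """Jaccard overlap of two pitch sets (an assumed, illustrative measure)."""
    union = score_pitches | audio_pitches
    return len(score_pitches & audio_pitches) / len(union) if union else 0.0


def dp_match(score, audio):
    """Monotonic matching of maximal total similarity, as (i, j) index pairs."""
    n, m = len(score), len(audio)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = max(
                D[i - 1][j],                      # skip score event i-1
                D[i][j - 1],                      # skip audio candidate j-1
                D[i - 1][j - 1] + similarity(score[i - 1], audio[j - 1]),
            )
    # Backtrack from the bottom-right corner to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if D[i][j] == D[i - 1][j]:
            i -= 1
        elif D[i][j] == D[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
    return pairs[::-1]


def find_anchors(score, audio):
    """Assumed anchor criterion: a pitch set that is unique in both sequences."""
    score_counts, audio_counts = Counter(score), Counter(audio)
    audio_index = {pitches: j for j, pitches in enumerate(audio)}
    anchors, last_j = [], -1
    for i, pitches in enumerate(score):
        if score_counts[pitches] == 1 and audio_counts.get(pitches, 0) == 1:
            j = audio_index[pitches]
            if j > last_j:                        # keep the anchor chain monotonic
                anchors.append((i, j))
                last_j = j
    return anchors


def synchronize(score, audio):
    """Fix the anchors first, then run the DP only between adjacent anchors."""
    bounds = [(-1, -1)] + find_anchors(score, audio) + [(len(score), len(audio))]
    matches = []
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        if i0 >= 0:
            matches.append((i0, j0))              # the anchor itself
        local = dp_match(score[i0 + 1:i1], audio[j0 + 1:j1])
        matches.extend((i0 + 1 + a, j0 + 1 + b) for a, b in local)
    return matches


# Toy input: five score events and six audio onset candidates,
# one of which (pitch 61) is spurious.
score = [frozenset(s) for s in ({60}, {64}, {60, 64, 67}, {64}, {60})]
audio = [frozenset(s) for s in ({60}, {64}, {61}, {60, 64, 67}, {64}, {60})]
print(synchronize(score, audio))
```

In this toy input the three-note chord is the only pitch set that is unique in both sequences, so it serves as the single anchor and splits the problem into two small DP tables instead of one table over the full sequences. With many anchors the local tables stay small, which is the source of the savings in running time and memory described above.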