Towards Score Following In Sheet Music Images

This paper addresses the matching of short music audio snippets to the corresponding pixel location in images of sheet music. It presents a system that simultaneously learns to read notes, listen to music, and match the currently played music to the corresponding notes in the sheet. The system is an end-to-end multi-modal convolutional neural network that takes as input images of sheet music and spectrograms of the respective audio snippets. For a given unseen audio snippet (covering approximately one bar of music), it learns to predict the corresponding position in the respective score line. Our results suggest that with (deep) neural networks, which have proven to be powerful image processing models, working with sheet music becomes feasible and is a promising direction for future research.
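The two-input design described in the abstract can be illustrated with a small sketch. This is not the authors' network. It assumes, purely for illustration, that the position in the score line is predicted as a softmax over a fixed number of horizontal buckets, and it replaces the convolutional stacks with single dense layers so that it runs with the Python standard library alone. The names `TwoBranchMatcher`, `bucket_to_pixel`, and all sizes are hypothetical, and the weights are random and untrained.

```python
import math
import random


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(x, weights, bias):
    """Compute W x + b for a flat input vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TwoBranchMatcher:
    """Hypothetical stand-in: dense layers replace the convolutional stacks."""

    def __init__(self, sheet_dim, audio_dim, hidden=8, n_buckets=10, seed=0):
        rng = random.Random(seed)
        self.sheet_branch = init_layer(rng, hidden, sheet_dim)
        self.audio_branch = init_layer(rng, hidden, audio_dim)
        self.head = init_layer(rng, n_buckets, 2 * hidden)

    def forward(self, sheet_pixels, spectrogram):
        # Encode each modality separately, then concatenate the features.
        h_sheet = relu(affine(sheet_pixels, *self.sheet_branch))
        h_audio = relu(affine(spectrogram, *self.audio_branch))
        joint = h_sheet + h_audio  # list concatenation
        # Assumed output: a distribution over horizontal position buckets.
        return softmax(affine(joint, *self.head))


def bucket_to_pixel(bucket, n_buckets, line_width):
    """Centre x-coordinate (in pixels) of a position bucket."""
    return (bucket + 0.5) * line_width / n_buckets


# Toy inputs: a flattened 20x50 score-line image and a 16x12 spectrogram.
rng = random.Random(1)
sheet = [rng.random() for _ in range(20 * 50)]
audio = [rng.random() for _ in range(16 * 12)]

model = TwoBranchMatcher(sheet_dim=len(sheet), audio_dim=len(audio))
probs = model.forward(sheet, audio)
best_bucket = max(range(len(probs)), key=probs.__getitem__)
x_pixel = bucket_to_pixel(best_bucket, len(probs), line_width=50)
```

Because the weights are untrained, `x_pixel` is meaningless here. The sketch only shows how features from the image branch and the audio branch could be merged into one prediction over positions in the score line.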

Gerhard Widmer | Andreas Arzt | Matthias Dorfer
