Optical Music Recognition by Recurrent Neural Networks

Optical Music Recognition is the task of transcribing a music score into a machine readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level.

[1]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[2]  Eric Nichols,et al.  Lyric Extraction and Recognition on Digital Images of Early Music Sources , 2009, ISMIR.

[3]  Alicia Fornés,et al.  Towards the Recognition of Compound Music Notes in Handwritten Music Scores , 2016, 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR).

[4]  T. Munich,et al.  Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks , 2008, NIPS.

[5]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[6]  Timothy C. Bell,et al.  The Challenge of Optical Music Recognition , 2001, Comput. Humanit..

[7]  Carlos Guedes,et al.  Optical music recognition: state-of-the-art and open issues , 2012, International Journal of Multimedia Information Retrieval.