Paper Information - End-to-End Optical Music Recognition Using Neural Networks

End-to-End Optical Music Recognition Using Neural Networks

This work addresses the Optical Music Recognition (OMR) task in an end-to-end fashion using neural networks. The proposed architecture is based on a Recurrent Convolutional Neural Network topology that takes an image of a monophonic score as input and outputs a sequence of music symbols. In the first stage, a series of convolutional filters are trained to extract meaningful features from the input image; a recurrent block then models the sequential nature of music. The system is trained using a Connectionist Temporal Classification (CTC) loss function, which avoids the need for a frame-by-frame alignment between the image and the ground-truth music symbols. Experiments were carried out on a set of 90,000 synthetic monophonic music scores with more than 50 different possible labels. The results show classification error rates of around 2% at the symbol level, demonstrating the potential of the proposed end-to-end architecture for OMR. The source code, dataset, and trained models are publicly released for reproducible research and future comparison purposes.
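To make the alignment-free idea concrete, the following is a minimal sketch, not the authors' released code. It assumes the network emits one label per image frame, applies the standard CTC collapsing rule (merge consecutive repeats, then drop blanks) via greedy decoding, and scores the result with an edit-distance-based symbol error rate. The label strings, the `<blank>` token name, and all function names are hypothetical; the abstract does not specify the decoding strategy or the exact error metric.

```python
from typing import List, Sequence

BLANK = "<blank>"  # hypothetical name for the CTC blank label


def ctc_greedy_decode(frame_labels: Sequence[str], blank: str = BLANK) -> List[str]:
    """Collapse per-frame labels into a symbol sequence.

    CTC collapsing rule: merge consecutive repeats, then drop blanks.
    A blank between two identical labels keeps them as two symbols.
    """
    decoded: List[str] = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences (rolling row)."""
    row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev_diag, row[0] = row[0], i
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            prev_diag, row[j] = row[j], min(
                row[j] + 1,        # deletion
                row[j - 1] + 1,    # insertion
                prev_diag + cost,  # substitution or match
            )
    return row[len(hyp)]


def symbol_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalised by reference length (assumed metric)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical per-frame predictions for a short monophonic staff image.
frames = [
    BLANK, "clef-G2", "clef-G2", BLANK,
    "note-C4_quarter", "note-C4_quarter", BLANK,
    "note-C4_quarter", "barline",
]
prediction = ctc_greedy_decode(frames)
reference = ["clef-G2", "note-C4_quarter", "note-D4_quarter", "barline"]
error = symbol_error_rate(reference, prediction)
```

Here nine frames collapse to four symbols without any frame-level annotation, and one substituted note out of four reference symbols gives an error rate of 0.25 for this toy example.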

Jorge Calvo-Zaragoza | Jose J. Valero-Mas | Antonio Pertusa
