Pixel-wise binarization of musical documents with convolutional neural networks

Binarization is an important process in document analysis systems. Yet it is difficult to devise a binarization method that performs well over a wide range of documents, especially digitized old musical manuscripts and scores with irregular lighting and source degradation. Our approach to the binarization of musical documents is based on training a Convolutional Neural Network that classifies each pixel of the image as either background or foreground. Our results demonstrate that the approach is competitive with other state-of-the-art algorithms. They also illustrate the advantage of being able to adapt to any type of score by simply modifying the training set.
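
As a rough illustration of the pixel-wise formulation, the sketch below labels each pixel from a small window centred on it. It is not the network described here: `toy_classifier` is a hypothetical, hand-written stand-in for a trained CNN, and the names `extract_patch` and `binarize`, the window size, the edge handling, and the 0.1 margin are all assumptions made for the example.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale, values in [0, 1]; low values = dark ink


def extract_patch(img: Image, row: int, col: int, half: int) -> Image:
    """Return the (2*half+1)-square window centred on (row, col).

    Coordinates that fall outside the image are clamped to the nearest
    edge pixel (an assumed border policy, chosen for simplicity).
    """
    h, w = len(img), len(img[0])
    return [
        [img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]
         for c in range(col - half, col + half + 1)]
        for r in range(row - half, row + half + 1)
    ]


def binarize(img: Image, classify: Callable[[Image], int], half: int = 1) -> List[List[int]]:
    """Classify every pixel independently: 1 = foreground, 0 = background."""
    return [
        [classify(extract_patch(img, r, c, half)) for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def toy_classifier(patch: Image) -> int:
    """Hypothetical stand-in for a trained CNN (assumption, not the paper's model).

    Marks the centre pixel as foreground when it is clearly darker than
    the mean of its window.
    """
    size = len(patch)
    centre = patch[size // 2][size // 2]
    mean = sum(sum(row) for row in patch) / (size * size)
    return 1 if centre < mean - 0.1 else 0


# A bright 3x3 page with one dark pixel in the middle.
page = [
    [0.9, 0.9, 0.9],
    [0.9, 0.1, 0.9],
    [0.9, 0.9, 0.9],
]
mask = binarize(page, toy_classifier)
```

In a learned setting, `classify` would be replaced by a model trained on patches paired with ground-truth labels, which is where changing the training set would adapt the method to a different type of score.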
