Document Image Binarization with Fully Convolutional Neural Networks

Binarization of degraded historical manuscript images is an important pre-processing step for many document processing tasks. We formulate binarization as a pixel classification learning task and apply a novel Fully Convolutional Network (FCN) architecture that operates at multiple image scales, including full resolution. The FCN is trained to optimize a continuous version of the Pseudo F-measure metric, and an ensemble of FCNs outperforms the competition winners on 4 of 7 DIBCO competitions. The same binarization technique can also be applied to other domains, such as Palm Leaf Manuscripts, with good performance. We analyze the performance of the proposed model with respect to the architectural hyperparameters, the size and diversity of the training data, and the input features chosen.
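
To illustrate what a continuous version of an F-measure-style metric could look like, the following is a minimal NumPy sketch, not the formulation used in this work. It replaces hard pixel counts with sums over predicted foreground probabilities so the score varies smoothly with the network output. The function names (soft_f_measure, soft_f_loss) and the optional per-pixel weight maps (recall_w, precision_w) are assumptions introduced here for illustration; with uniform weights the expression reduces to an ordinary soft F-measure.

```python
import numpy as np


def soft_f_measure(prob, target, recall_w=None, precision_w=None, eps=1e-8):
    """Continuous F-measure between predicted probabilities and a binary mask.

    prob        : array of foreground probabilities in [0, 1]
    target      : array of ground-truth labels in {0, 1} (1 = foreground)
    recall_w    : hypothetical per-pixel weights for the recall term
    precision_w : hypothetical per-pixel weights for the precision term
    """
    prob = np.asarray(prob, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if recall_w is None:
        recall_w = np.ones_like(target)
    if precision_w is None:
        precision_w = np.ones_like(target)

    # Soft true positives: probability mass placed on true foreground pixels.
    tp_recall = np.sum(recall_w * prob * target)
    tp_precision = np.sum(precision_w * prob * target)

    # Recall normalizes by ground-truth foreground, precision by predicted foreground.
    recall = tp_recall / (np.sum(recall_w * target) + eps)
    precision = tp_precision / (np.sum(precision_w * prob) + eps)

    # Harmonic mean of precision and recall.
    return 2.0 * precision * recall / (precision + recall + eps)


def soft_f_loss(prob, target, recall_w=None, precision_w=None):
    """Loss to minimize: one minus the continuous F-measure."""
    return 1.0 - soft_f_measure(prob, target, recall_w, precision_w)
```

Because every term is a sum of products of the probabilities, the same expression can be written in an automatic differentiation framework and used directly as a training objective. An ensemble prediction could be formed, for example, by averaging the probability maps of several networks before thresholding, though the combination rule used here is not stated in the abstract.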
