A deep learning approach to document image quality assessment

This paper proposes a deep learning approach to document image quality assessment. Given a noise-corrupted document image, we estimate its quality score as a prediction of OCR accuracy. First, the document image is divided into patches, and non-informative patches are sifted out using Otsu's binarization technique. Second, a Convolutional Neural Network (CNN) assigns a quality score to each selected patch, and the patch scores are averaged over the image to obtain the document score. The proposed CNN contains two convolutional layers, location-blind max-min pooling, and Rectified Linear Units in the fully connected layers. Experiments on two document quality datasets show that our method achieves state-of-the-art performance.
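The patch-based pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The patch size, the foreground-fraction bounds used to decide whether a patch is informative, the dark-text-on-light-background convention, and the names `select_patches`, `document_score`, `max_min_pool`, and `patch_scorer` are all assumptions made for illustration. The abstract states only that Otsu's binarization is used to sift patches, not the exact criterion. `patch_scorer` stands in for the trained CNN, and `max_min_pool` shows one plausible reading of location-blind max-min pooling: each feature map is reduced to its global maximum and minimum regardless of position.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold t for a uint8 image (class 0 is gray <= t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of class 0
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean of class 0
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)                   # between-class variance
    valid = denom > 1e-12
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def select_patches(gray, patch=48, min_fg=0.05, max_fg=0.95):
    """Keep non-overlapping patches with a moderate share of foreground pixels.

    Assumption: text is darker than the background, and a patch counts as
    informative when its foreground fraction lies in [min_fg, max_fg].
    """
    foreground = gray <= otsu_threshold(gray)
    height, width = gray.shape
    kept = []
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            frac = foreground[y:y + patch, x:x + patch].mean()
            if min_fg <= frac <= max_fg:
                kept.append(gray[y:y + patch, x:x + patch])
    return kept


def document_score(gray, patch_scorer, patch=48):
    """Average the per-patch scores; return None if no patch is informative."""
    patches = select_patches(gray, patch=patch)
    if not patches:
        return None
    return float(np.mean([patch_scorer(p) for p in patches]))


def max_min_pool(feature_maps):
    """Reduce (K, H, W) feature maps to a 2K vector of global maxima and minima."""
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    return np.concatenate([flat.max(axis=1), flat.min(axis=1)])


if __name__ == "__main__":
    page = np.full((96, 96), 255, dtype=np.uint8)   # blank white page
    page[10:20, 5:40] = 0                           # one dark stroke, top-left patch
    # A constant scorer stands in for the trained CNN (hypothetical).
    print(document_score(page, patch_scorer=lambda p: 0.5))
```

In this sketch a page with no informative patches yields `None` rather than a score, which leaves it to the caller to decide how to treat blank pages. The abstract does not say how the original method handles that case.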
