Subjective and objective quality assessment of degraded document images

Abstract The vast number of degraded documents stored in libraries and archives around the world calls for automatic procedures for enhancement, classification, transliteration, and similar tasks. While high-quality images of these documents are generally easy to capture, the amount of damage the documents had sustained before imaging is unknown. It is therefore highly desirable to measure the severity of degradation in each document image. Such a degradation assessment can be used to tune the parameters of processing algorithms, to select the proper algorithm, and to find damaged or exceptional documents, among other applications. In this paper, we introduce the first dataset of degraded document images accompanied by human opinion scores for each image, in order to evaluate image quality assessment metrics on historical document images. Human judgments of the overall quality of the document image are used in place of the OCR performance used in previous work. We also propose an objective no-reference quality metric based on the statistics of the mean subtracted contrast normalized (MSCN) coefficients computed from segmented layers of each document image. The segmentation into four layers of foreground and background is based on an analysis of log-Gabor filter responses, and rests on the assumption that the sensitivity of the human visual system (HVS) differs between text and non-text locations. Experimental results show that the proposed metric performs comparably to or better than state-of-the-art metrics while having moderate complexity. The developed dataset and the Matlab source code of the proposed metric are available at http://www.synchromedia.ca/system/files/VDIQA.zip .
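
For readers unfamiliar with MSCN coefficients, the following is a minimal Python sketch of how they are commonly computed in the no-reference quality assessment literature: each pixel has a local Gaussian-weighted mean subtracted and is divided by the local standard deviation plus a stabilizing constant. It is not the authors' Matlab implementation. The 7-tap window, the Gaussian width of 7/6, the constant c = 1, and the helper names (gaussian_kernel_1d, smooth, mscn, layer_statistics) are assumptions made for illustration. The layer mask is likewise hypothetical; the log-Gabor-based segmentation described in the abstract is not reproduced here, and the mean and variance returned per layer are placeholders rather than the features of the proposed metric.

```python
import numpy as np


def gaussian_kernel_1d(sigma=7.0 / 6.0, radius=3):
    """Normalized 1-D Gaussian window (assumed size: 2 * radius + 1 taps)."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def smooth(img, kernel):
    """Separable Gaussian smoothing with edge replication at the borders."""
    r = len(kernel) // 2
    padded = np.pad(img, r, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows
    )


def mscn(img, c=1.0):
    """Mean subtracted contrast normalized coefficients of a grayscale image."""
    img = np.asarray(img, dtype=np.float64)
    k = gaussian_kernel_1d()
    mu = smooth(img, k)                      # local mean
    var = smooth(img * img, k) - mu * mu     # local variance
    sigma = np.sqrt(np.maximum(var, 0.0))    # clip tiny negative values
    return (img - mu) / (sigma + c)


def layer_statistics(coeffs, mask):
    """Mean and variance of the coefficients inside one (hypothetical) layer mask."""
    values = coeffs[mask]
    return float(values.mean()), float(values.var())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    page = rng.uniform(0, 255, size=(64, 64))   # stand-in for a document image
    coeffs = mscn(page)
    text_mask = np.zeros(page.shape, dtype=bool)
    text_mask[:32] = True                       # placeholder layer, not a real segmentation
    print(layer_statistics(coeffs, text_mask))
```

On a perfectly flat region the local mean equals the pixel value, so the coefficients are zero there; degradations and text strokes show up as deviations from that baseline.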
