MHDID: A Multi-distortion Historical Document Image Database

This paper proposes a new dataset, the Multi-distortion Historical Document Image Database (MHDID), for research on quality assessment of degraded documents and on degradation classification. MHDID contains 335 historical document images classified into four categories by distortion type: paper translucency, stains, readers’ annotations, and worn holes. A total of 36 subjects judged the quality of these historical document images. Pair comparison rating (PCR) is used as the subjective rating method for evaluating the visual quality of the degraded document images, and a mean opinion score (MOS) is computed for each distorted image. The dataset can be used to evaluate existing image quality assessment (IQA) measures and to design new metrics.
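The abstract does not say how per-image scores are derived from the pair comparisons. As a rough illustration only, the sketch below turns a matrix of pairwise preference counts into a score per image by taking the fraction of comparisons each image won. The function name `preference_scores`, the `wins` matrix, and the tallies in it are hypothetical, and this win-proportion rule is an assumed stand-in, not the authors' procedure.

```python
from typing import List


def preference_scores(wins: List[List[int]]) -> List[float]:
    """Return, for each image, the fraction of its comparisons that it won.

    wins[i][j] is the number of times image i was preferred over image j.
    This is an assumed scoring rule, not the one used to build MHDID.
    """
    n = len(wins)
    scores = []
    for i in range(n):
        won = sum(wins[i][j] for j in range(n) if j != i)
        played = sum(wins[i][j] + wins[j][i] for j in range(n) if j != i)
        # An image that was never compared gets a score of 0.0.
        scores.append(won / played if played else 0.0)
    return scores


# Hypothetical tallies for three images, each pair judged by 36 subjects.
wins = [
    [0, 30, 33],
    [6, 0, 24],
    [3, 12, 0],
]
scores = preference_scores(wins)
for index, score in enumerate(scores):
    print(f"image {index}: {score:.3f}")
```

Scores of this kind can then be compared against the output of an IQA measure, for example with a rank correlation, to judge how well the measure tracks subjective quality.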
