NIST Special Database 8 Machine Print Database

This report describes the NIST Machine Print Database, NIST Special Database 8 (SD8). SD8 contains 360 8-bit grayscale images of pages of machine-printed characters, together with a corresponding binary version of each page, for a total of 720 digitized pages. The database is distributed as a common set of images for developing and testing Optical Character Recognition (OCR) systems, so that vendors can report results against the same image set. Each disc in the three-disc set holds approximately 593 megabytes of data when the images are compressed. Uncompressed, the contents of each disc expand to 1.1 gigabytes (an average compression ratio of 1.85:1 using JPEG[3] and CCITT Group 4[1]).
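The page count and compression ratio quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch rather than anything distributed with the database: the constant and function names are illustrative, and it assumes decimal units (1 gigabyte = 1000 megabytes), which the report does not state explicitly.

```python
# Figures quoted in the abstract. Names are illustrative, not part of SD8.
GRAY_PAGES = 360                  # 8-bit grayscale page images
COMPRESSED_MB_PER_DISC = 593      # approximate, per disc
UNCOMPRESSED_MB_PER_DISC = 1100   # 1.1 GB, ASSUMING 1 GB = 1000 MB


def total_pages(gray_pages: int) -> int:
    """Each grayscale page has one corresponding binary version."""
    return 2 * gray_pages


def compression_ratio(uncompressed_mb: float, compressed_mb: float) -> float:
    """Ratio of uncompressed size to compressed size."""
    return uncompressed_mb / compressed_mb


if __name__ == "__main__":
    ratio = compression_ratio(UNCOMPRESSED_MB_PER_DISC, COMPRESSED_MB_PER_DISC)
    print(f"total digitized pages: {total_pages(GRAY_PAGES)}")
    print(f"average compression ratio: {ratio:.2f} : 1")
```

Under the decimal-unit assumption the ratio agrees with the quoted 1.85 : 1. With binary units (1 gigabyte = 1024 megabytes) it would come out closer to 1.90 : 1, so the quoted figures are most consistent with decimal units.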