Digitise to Discard: 32 Million Newspaper Pages in Three Years

This paper presents a detailed case study of a newspaper digitisation project at the State and University Library, Denmark. The library is digitising 32 million newspaper pages in three years on a limited budget. The purpose of the project is twofold: to digitise in order to discard one of the two printed newspaper copies preserved at the State and University Library and the Royal Library in Denmark, and to enable users to search and read the digital newspapers online. The pages are digitised from microfilm. A digitisation project on this scale calls for innovative workflows, especially for quality control. Automatic and manual quality control processes and tools have been developed that can handle 50,000 pages a day, or one million pages a month. The purpose of the quality control processes is to ensure sufficient quality for the library to discard 32 million newspaper pages from an outdated storage facility. This supports a preservation strategy for newspapers entailing one printed copy, one microfilm copy and one JPEG 2000 file for each page, instead of two printed copies and one microfilm copy. The digital newspaper pages resulting from the project will be accessible through the State and University Library’s own online portal, Mediestream, or through partnerships with newspaper companies.
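As a rough check on the throughput figures quoted above, the following Python sketch relates the daily and monthly quality control capacity to the overall target. It is a back-of-the-envelope illustration only, not part of the project's tooling: the function names are hypothetical, and the three-year period is assumed to be 36 evenly loaded months.

```python
import math

# Figures taken from the abstract.
TOTAL_PAGES = 32_000_000      # pages to digitise
PAGES_PER_DAY = 50_000        # quality control capacity per day
PAGES_PER_MONTH = 1_000_000   # quality control capacity per month

# Assumption (not stated in the paper): three years = 36 evenly loaded months.
PROJECT_MONTHS = 36


def required_monthly_rate(total_pages: int, months: int) -> float:
    """Average number of pages per month needed to finish on time."""
    return total_pages / months


def implied_working_days(pages_per_month: int, pages_per_day: int) -> float:
    """Working days per month implied by the daily and monthly capacities."""
    return pages_per_month / pages_per_day


def months_at_capacity(total_pages: int, pages_per_month: int) -> int:
    """Whole months needed to process all pages at full monthly capacity."""
    return math.ceil(total_pages / pages_per_month)


if __name__ == "__main__":
    print(f"Required average: {required_monthly_rate(TOTAL_PAGES, PROJECT_MONTHS):,.0f} pages/month")
    print(f"Implied working days per month: {implied_working_days(PAGES_PER_MONTH, PAGES_PER_DAY):.0f}")
    print(f"Months at full capacity: {months_at_capacity(TOTAL_PAGES, PAGES_PER_MONTH)}")
```

Under these assumptions, the quoted capacity of one million pages a month is slightly above the average rate of roughly 890,000 pages a month needed to finish in three years, and the daily and monthly figures together imply about 20 working days per month.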