Digital Library of India: A Testbed for Indian Language Research

This paper describes the goal of the Universal Digital Library Project (UDL) and presents the approach taken by the Million Books to the Web Project (MBP), along with the technological challenges associated with it. It then discusses the Digital Library of India (DLI) initiative, which is the Indian part of the UDL and MBP. DLI fosters a large number of research activities in areas such as text summarization, information retrieval, machine translation and transliteration, optical character recognition, handwriting recognition, natural language parsing, and morphological analysis. This paper provides an overview of the activities of DLI in these areas and shows how DLI serves as a multilingual resource.