Character and Text Recognition of Khmer Historical Palm Leaf Manuscripts

This paper presents methods for two historical document analysis tasks on digitized Khmer palm leaf manuscripts. The first task, isolated character recognition, is addressed with several neural network architectures: a CNN, an LSTM-RNN, and a combination of the two. The second task is to recognize word or text image patches of variable length while simultaneously localizing each glyph in the text image. For this task, both one-dimensional and two-dimensional RNNs are used, a choice motivated by the characteristics of the Khmer writing system.
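
To make the one-dimensional case concrete: a one-dimensional recurrent network is commonly given a text image as a left-to-right sequence of pixel columns, one column per time step, so the sequence length follows the image width. The sketch below illustrates only that framing step and is not the pipeline used in this paper. The function name `columns_as_sequence` and the list-of-rows image format are assumptions made for illustration.

```python
from typing import List

# Hypothetical image format: a list of rows, each a list of pixel intensities.
Image = List[List[float]]


def columns_as_sequence(image: Image) -> List[List[float]]:
    """Return one feature vector per pixel column, ordered left to right."""
    if not image:
        return []
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    # Time step x holds the pixels of column x, read top to bottom.
    return [[row[x] for row in image] for x in range(width)]


# A toy 3x4 patch: 3 rows (height) by 4 columns (width).
patch = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
sequence = columns_as_sequence(patch)
print(len(sequence), len(sequence[0]))  # prints "4 3"
```

A wider patch simply yields a longer sequence, which is what lets a recurrent model handle variable-length text images. A two-dimensional RNN instead scans the image along both axes and does not need this flattening into columns.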