Training an End-to-End Model for Offline Handwritten Japanese Text Recognition by Generated Synthetic Patterns

This paper presents an end-to-end Deep Convolutional Recurrent Network (DCRN) model for recognizing offline handwritten Japanese text lines. The end-to-end DCRN model consists of three parts: a convolutional feature extractor using a Deep Convolutional Neural Network (DCNN) to extract a feature sequence from a text line image; recurrent layers employing a deep bidirectional LSTM to make per-frame predictions from the feature sequence; and a transcription layer using Connectionist Temporal Classification (CTC) to convert the per-frame predictions into a label sequence. Since our end-to-end model requires a large amount of training data, we synthesize handwritten text line images from sentences in corpora and handwritten character patterns in the Nakayosi and Kuchibue databases with elastic distortions. In the experiments, we evaluate the performance of the end-to-end model and the effectiveness of the synthetic data generation method on the test set of the TUAT Kondate database. The results show that our end-to-end model surpasses the state-of-the-art recognition accuracy on the TUAT Kondate test set, achieving character-level recognition accuracies of 96.35% without and 98.05% with the generated synthetic data.
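To make the DCNN-BLSTM-CTC pipeline concrete, the following is a minimal sketch of such an architecture in PyTorch. It is not the paper's exact configuration: the layer sizes, vocabulary size, input dimensions, and the names `DCRN`, `img_height`, and `num_classes` are illustrative assumptions; only the overall structure (convolutional feature extractor, bidirectional LSTM over frames, CTC transcription) follows the description above.

```python
# Minimal sketch of a DCRN-style recognizer, assuming PyTorch.
# Hyperparameters below are placeholders, not the paper's settings.
import torch
import torch.nn as nn

class DCRN(nn.Module):
    def __init__(self, num_classes, img_height=64):
        super().__init__()
        # Convolutional feature extractor: reduces the image height
        # while keeping the horizontal (time) axis as the frame axis.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1), (2, 1)),  # halve height only
        )
        feat_h = img_height // 8
        # Deep bidirectional LSTM over the extracted frame sequence.
        self.rnn = nn.LSTM(256 * feat_h, 256, num_layers=2,
                           bidirectional=True, batch_first=True)
        # Per-frame class scores (CTC blank at index 0).
        self.fc = nn.Linear(512, num_classes)

    def forward(self, x):                 # x: (batch, 1, H, W)
        f = self.cnn(x)                   # (batch, C, H', W')
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # frames along width
        out, _ = self.rnn(f)
        return self.fc(out)               # (batch, frames, num_classes)

# One training step with CTC loss on dummy data.
model = DCRN(num_classes=100)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
images = torch.randn(2, 1, 64, 256)                  # dummy text-line images
targets = torch.randint(1, 100, (2, 10))             # dummy label sequences
log_probs = model(images).log_softmax(-1).permute(1, 0, 2)  # (T, batch, classes)
input_lengths = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((2,), 10, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```

In a real setup, the dummy tensors would be replaced by batches of (possibly synthetic) text line images and their character label sequences, with the Japanese character set determining `num_classes`.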