Post-processing Study of Chinese Character Recognition Based on N-gram Language Model

To improve the recognition rate of Chinese text, this paper proposes an automatic post-processing method that combines an N-gram language model with a single character recognition (SCR) model. A bounded sequence of Chinese characters (most often a sentence) is processed as a unit. The method uses the co-occurrence probabilities between characters together with the Viterbi algorithm, so that post-processing of Chinese text is carried out automatically. Experiments show that the average accuracy of Chinese character recognition increases from 97.62% to 98.71%.
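
The combination of SCR output and character co-occurrence probabilities can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a bigram model (the abstract does not state the order N), assumes that the SCR model returns a few scored candidates per character position, and assumes the two scores are combined by simple multiplication. The function name `viterbi_postprocess`, the `floor` smoothing value, and all probabilities in the example are hypothetical and chosen only for illustration.

```python
import math


def viterbi_postprocess(candidates, bigram, floor=1e-6):
    """Pick the most likely character sequence for one sentence.

    candidates: one list per character position, each holding
                (char, scr_prob) pairs with scr_prob > 0 (assumed SCR output).
    bigram:     dict mapping (prev_char, char) to a co-occurrence probability.
    floor:      hypothetical probability assigned to unseen bigrams.
    """
    if not candidates:
        return ""

    # best[i][c] = (log score of the best path ending in c at position i,
    #               previous character on that path)
    best = [{c: (math.log(p), None) for c, p in candidates[0]}]

    for position in candidates[1:]:
        column = {}
        for c, p in position:
            # Choose the predecessor that maximizes path score plus transition.
            prev, score = max(
                ((q, s + math.log(bigram.get((q, c), floor)))
                 for q, (s, _) in best[-1].items()),
                key=lambda item: item[1],
            )
            column[c] = (score + math.log(p), prev)
        best.append(column)

    # Backtrack from the best final character to the start of the sentence.
    char = max(best[-1], key=lambda c: best[-1][c][0])
    path = [char]
    for column in reversed(best[1:]):
        char = column[char][1]
        path.append(char)
    return "".join(reversed(path))


# Illustrative data only: the top SCR choice at the last position is wrong.
candidates = [
    [("大", 0.60), ("太", 0.40)],
    [("学", 0.90), ("字", 0.10)],
    [("主", 0.55), ("生", 0.45)],
]
bigram = {
    ("大", "学"): 0.05,
    ("太", "学"): 0.001,
    ("学", "生"): 0.08,
    ("学", "主"): 0.0001,
}

print(viterbi_postprocess(candidates, bigram))  # prints 大学生
```

In this toy example, taking the top SCR candidate at each position independently would give 大学主, whereas decoding the sentence as a unit lets the bigram probabilities correct the last character. Working in log space avoids numerical underflow on long sentences.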