Handprinted Chinese Character Recognition Using the Probability Distribution Feature

A simplified Bayes rule is used to classify 5401 categories of handwritten Chinese characters. The main feature used by the Bayes rule is the probability distribution of the black pixels of a thinned character. Our idea is that the black pixels of each Chinese character represent a probability distribution in a two-dimensional plane, so an unknown pattern is classified by the Bayes rule into one of 5401 different distributions. Because handwritten characters vary irregularly in shape, the whole character is first normalized and then thinned. Finally, a transformation spreads the black pixels uniformly over the whole square plane while preserving the relative positions of the original black pixels. The main feature alone gives an 88.65% recognition rate. To raise the recognition rate, four subsidiary features are carefully selected so that they are not much affected by irregular shape variation. The four features raise the recognition rate to 93.43%. A recognition rate of 99.30% is achieved when the top 10 candidate categories of handwritten Chinese characters are selected by our recognition method, and 99.61% when the top 20 are selected.
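The idea of treating each category as a probability distribution over pixel positions can be illustrated with a minimal sketch. It is not the method of this paper in detail: it assumes that black pixels are independent given the category and that all categories have equal priors, and it adds Laplace smoothing to avoid zero probabilities. The names `estimate_distribution`, `log_likelihood`, and `classify` are hypothetical, and normalization, thinning, the spreading transformation, and the four subsidiary features are all omitted.

```python
import math
from collections import Counter


def estimate_distribution(samples, size, alpha=1.0):
    """Estimate P(row, col | category) from training samples of one category.

    Each sample is a list of (row, col) black-pixel coordinates on a
    size x size grid. alpha is a Laplace smoothing constant (an assumption
    of this sketch, not something stated in the abstract).
    """
    counts = Counter()
    for pixels in samples:
        counts.update(pixels)
    total = sum(counts.values()) + alpha * size * size
    return {
        (r, c): (counts[(r, c)] + alpha) / total
        for r in range(size)
        for c in range(size)
    }


def log_likelihood(pixels, dist):
    """Log-probability of the observed black pixels under one category,
    assuming the pixels are independent (a simplifying assumption)."""
    return sum(math.log(dist[p]) for p in pixels)


def classify(pixels, models, top_k=1):
    """Return the top_k category labels ranked by log-likelihood.

    With equal priors, maximizing the likelihood is equivalent to
    maximizing the Bayes posterior.
    """
    ranked = sorted(
        models,
        key=lambda label: log_likelihood(pixels, models[label]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 3x3 example with two made-up categories (not real character data).
SIZE = 3
training = {
    "vertical": [[(0, 1), (1, 1), (2, 1)], [(0, 1), (1, 1), (2, 1)]],
    "horizontal": [[(1, 0), (1, 1), (1, 2)], [(1, 0), (1, 1), (1, 2)]],
}
models = {
    label: estimate_distribution(samples, SIZE)
    for label, samples in training.items()
}

# A slightly distorted vertical stroke: the last pixel is shifted right.
unknown = [(0, 1), (1, 1), (2, 2)]
print(classify(unknown, models, top_k=2))
```

Returning a ranked list rather than a single label mirrors the top-10 and top-20 candidate rates reported above: the correct category may not rank first, but it is often among the leading candidates.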