Handwritten character segmentation for Kannada scripts

Character segmentation is a crucial step in many OCR systems, because incorrectly segmented characters are unlikely to be recognized correctly. Segmenting cursive scripts is more challenging because they contain more touching characters. Kannada is one of the popular languages of South India, and some of its letters are cursive in nature. In this paper, a new character segmentation algorithm for unconstrained handwritten Kannada scripts is presented. The proposed method is based on thinning, branch points, and mixture models. The expectation-maximization (EM) algorithm is used to learn the mixture of Gaussians. We use the cluster mean points to estimate the direction and the branch points as reference points for segmenting characters. We have experimentally evaluated the proposed method on Kannada words, and it has shown encouraging results.
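To make the mixture-model step concrete, the following is a minimal sketch of EM for a mixture of Gaussians. It is not the implementation evaluated in this paper. It assumes the data are (x, y) coordinates of thinned stroke pixels, uses spherical covariances for brevity, and takes hand-picked initial means. The function name `fit_gmm_em`, the toy points, and the way a direction is read off the two cluster means are all illustrative assumptions.

```python
import math


def fit_gmm_em(points, init_means, n_iter=50):
    """Fit a spherical 2-D Gaussian mixture with EM (illustrative sketch).

    points: list of (x, y) tuples; init_means: starting component means.
    Returns (weights, means, variances).
    """
    k = len(init_means)
    means = [tuple(m) for m in init_means]
    weights = [1.0 / k] * k
    variances = [1.0] * k  # assumed starting variance per component
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x, y in points:
            dens = []
            for w, (mx, my), var in zip(weights, means, variances):
                d2 = (x - mx) ** 2 + (y - my) ** 2
                dens.append(w * math.exp(-d2 / (2 * var)) / (2 * math.pi * var))
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            d2 = sum(r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                     for r, p in zip(resp, points))
            means[j] = (mx, my)
            variances[j] = max(d2 / (2 * nj), 1e-6)  # 2 dimensions
            weights[j] = nj / len(points)
    return weights, means, variances


# Toy data (hypothetical): two well-separated blobs of stroke pixels.
points = [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1),
          (9, 0), (9, 2), (11, 0), (11, 2), (10, 1)]
weights, means, variances = fit_gmm_em(points, init_means=[(0, 1), (12, 1)])

# One assumed way to turn cluster means into a direction estimate:
# the angle of the line joining the two means.
angle_deg = math.degrees(math.atan2(means[1][1] - means[0][1],
                                    means[1][0] - means[0][0]))
```

On real handwriting, the number of components, the initialization, and the covariance structure would all need to be chosen for the data; full covariances, for example, capture stroke orientation directly.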