Off-line Farsi/Arabic handwritten word recognition using vector quantization and hidden Markov model

This paper presents a Farsi handwritten word recognition system for reading city names in postal addresses. The method is based on vector quantization (VQ) and hidden Markov models (HMMs). A window sliding from right to left is used to extract features (four features are proposed). After feature extraction, K-means clustering is used to generate a codebook, and VQ generates a codeword for each word image. In the next stage, an HMM is trained for each city name with the Baum-Welch algorithm. A test image is recognized by finding the best match (highest likelihood) between the image and all of the word HMMs using the forward algorithm. Experimental results show the advantages of the VQ/HMM recognition engine over a conventional discrete HMM.
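
The recognition stage described above can be illustrated with a minimal sketch. It is not the system's implementation: it assumes that each window position yields one feature vector, that every vector is quantized to the index of its nearest codeword, and that each city name has a discrete HMM given by hypothetical parameters `pi` (initial probabilities), `A` (transitions) and `B` (symbol emissions). The function names `quantize`, `forward_log_likelihood` and `recognize` are illustrative, and codebook training with K-means and HMM training with Baum-Welch are omitted.

```python
import math

def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword
    (squared Euclidean distance). The codebook is assumed to be given."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: dist2(f, codebook[k]))
            for f in frames]

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | model) for a
    discrete HMM with initial vector pi, transitions A, emissions B."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_p += math.log(scale)
    return log_p

def recognize(obs, models):
    """Return the name of the model with the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Toy data (hypothetical): 2 codewords, two 2-state left-to-right word models.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 1.2], [1.0, 0.8]]
A = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "city_a": ([1.0, 0.0], A, [[0.9, 0.1], [0.1, 0.9]]),
    "city_b": ([1.0, 0.0], A, [[0.1, 0.9], [0.9, 0.1]]),
}
symbols = quantize(frames, codebook)
best = recognize(symbols, models)
```

The scaling step keeps the forward variables in a safe numeric range for long symbol sequences; summing the logarithms of the scale factors gives the same log-likelihood as the unscaled recursion.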