Efficient chain-code-based image manipulation for handwritten word recognition

Efficient image handling in handwritten document recognition is an important research issue for real-time applications. This paper presents image manipulation procedures for a fast handwritten word recognizer, covering pre-processing, segmentation, and feature extraction, all implemented on the chain code representation. Pre-processing includes noise removal, slant correction, and smoothing of contours. The slant angle is estimated by averaging the orientation angles of vertical strokes. Smoothing removes jaggedness from contours. Segmentation points are determined using ligatures and concavity features. The average stroke width of an image is used adaptively to locate ligatures. Concavities are located by examining slope changes in contours. Feature extraction efficiently converts a segment into feature vectors. Experimental results demonstrate the efficiency of the algorithms developed. Three thousand word images captured from real mail pieces, with an average size of 217 by 82, are used in the experiments. The average processing times per module on a single Sparc 10 are 10, 15, and 34 msec for pre-processing, segmentation, and feature extraction, respectively.
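
As an illustration of how slant can be estimated directly from a chain code, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes an 8-direction Freeman chain code with the y axis pointing up, treats each maximal run of upward or downward steps as one "vertical stroke", and averages the angle of each run from the vertical. The names `STEPS`, `stroke_angles`, `estimate_slant`, and the `min_len` threshold are hypothetical choices made for this example.

```python
import math

# Assumed Freeman 8-direction steps (dx, dy), y axis pointing up:
# 0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE
STEPS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
UP, DOWN = {1, 2, 3}, {5, 6, 7}


def stroke_angles(chain, min_len=3):
    """Return the angle from vertical (degrees) of each near-vertical run.

    A run is a maximal sequence of consecutive codes that all move up or
    all move down. Runs shorter than min_len (a hypothetical threshold)
    are ignored. Runs that wrap around the start of a closed contour are
    not merged in this sketch.
    """
    angles = []
    run, run_dir = [], None
    # A trailing horizontal step acts as a sentinel that flushes the last run.
    for code in list(chain) + [0]:
        direction = UP if code in UP else DOWN if code in DOWN else None
        if direction is not run_dir:
            if run_dir is not None and len(run) >= min_len:
                dx = sum(STEPS[c][0] for c in run)
                dy = sum(STEPS[c][1] for c in run)
                if run_dir is DOWN:
                    # Fold downward runs so both contour sides agree in sign.
                    dx, dy = -dx, -dy
                # Positive angle means the stroke leans to the right.
                angles.append(math.degrees(math.atan2(dx, dy)))
            run, run_dir = [], direction
        if direction is not None:
            run.append(code)
    return angles


def estimate_slant(chain, min_len=3):
    """Average the stroke angles; return 0.0 when no stroke qualifies."""
    angles = stroke_angles(chain, min_len)
    return sum(angles) / len(angles) if angles else 0.0


if __name__ == "__main__":
    # Toy contour: up the left side of a right-leaning stroke, then back down.
    contour = [2, 2, 1, 2, 2, 1, 0, 6, 6, 5, 6, 6, 5, 4]
    print(estimate_slant(contour))
```

Because the estimate uses only the chain code elements, it touches each contour point once and never revisits the pixel array, which is the kind of saving a contour-based representation makes possible. Given the estimated angle, slant correction could then be applied as a horizontal shear of the contour coordinates.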