Arabic character recognition system: A statistical approach for recognizing cursive typewritten text

Abstract: Character recognition systems can contribute tremendously to the advancement of automation, and can improve the interface between man and machine (computers) in many applications, including office automation and data entry. In this report we present a recognition system for typed Arabic text that takes a statistical approach to character recognition. This approach uses “Accumulative Invariant Moments” as an identifier, which helped in the segmentation of connected and overlapping Arabic characters. However, invariant moments proved to be very sensitive to slight changes in a character's shape. These changes are normally due to the typing and scanning processes, and cannot be avoided. The recognition zone for each character was defined from the mean and standard deviation of the moments of a large sample of that character, and was then enlarged by an empirical multiplier to improve the recognition rate. The system was implemented on a mainframe in the APL programming language for ease of experimentation, and then ported to a PC environment in BASIC for better portability. The recognition rate achieved was 94%, with a recognition speed of 10.6 characters/minute, running on a PC/AT with a math co-processor.
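The idea of a recognition zone built from moment statistics can be illustrated with a minimal Python sketch. It is not the system described above (which was written in APL and BASIC) and it does not reproduce the accumulative variant of the moments. It assumes the first two of Hu's moment invariants computed over a whole binary glyph image, a per-feature zone of mean ± k standard deviations, and hypothetical names (`hu_moments`, `recognition_zone`, `in_zone`, and the multiplier `k`) chosen only for illustration.

```python
from statistics import mean, stdev


def hu_moments(img):
    """Return Hu's first two moment invariants (phi1, phi2) of a binary image.

    img is a list of rows, each a list of 0/1 pixel values (an assumed format).
    """
    pts = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    m00 = len(pts)
    if m00 == 0:
        raise ValueError("image contains no foreground pixels")
    # Centroid of the foreground pixels.
    xc = sum(x for x, _ in pts) / m00
    yc = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Normalized central moment: mu_pq / mu_00^(1 + (p + q) / 2).
        mu = sum((x - xc) ** p * (y - yc) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return (phi1, phi2)


def recognition_zone(samples, k=1.0):
    """Build a per-feature interval mean +/- k * std from training samples.

    samples is a list of feature tuples for one character class; k plays the
    role of the empirical multiplier (its value here is an assumption).
    """
    zone = []
    for values in zip(*samples):
        m, s = mean(values), stdev(values)
        zone.append((m - k * s, m + k * s))
    return zone


def in_zone(features, zone):
    """True if every feature falls inside its interval of the zone."""
    return all(lo <= f <= hi for f, (lo, hi) in zip(features, zone))
```

In such a scheme, a larger `k` widens each zone, so more distorted samples of a character are accepted, at the cost of more overlap between the zones of similar characters.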
