Optimizing Feature Selection for Recognizing Handwritten Arabic Characters

Abstract: Recognition of characters greatly depends upon the features used. Several features of handwritten Arabic characters are selected and discussed. An off-line recognition system based on the selected features was built. The system was trained and tested with realistic samples of handwritten Arabic characters. The importance and accuracy of the selected features are evaluated. Recognition based on the selected features gives average accuracies of 88% and 70% for the numbers and letters, respectively. Further improvements are achieved by using feature weights based on insights gained from the accuracies of individual features.

Keywords: Arabic handwritten characters, Feature extraction, Off-line recognition, Optical character recognition.

I. INTRODUCTION

The Arabic writing system differs from European writing systems in several ways [1]. It is written from right to left, and it is always cursive, whether handwritten or printed. It contains 28 characters. Six of them can be connected only from the right-hand side and have two shapes (connected and standalone). The rest can be connected from either or both sides. Hence, there are four shapes for each of these 22 characters, according to the location of the character in the word (start, middle, end, or standalone). Some of the characters have secondary objects such as a dot or a combination of dots (one, two, or three). In fact, some characters can be distinguished from others only by these secondary features. Comparisons of various characteristics of Arabic, Latin, and other languages are discussed in many references [2].

Many off-line Arabic character recognition techniques for printed text have been published [3]. The state of the art for this type of recognition achieves good accuracy. There are even some successful commercial products for this application [4]. However, the off-line recognition of handwritten Arabic characters is more difficult, and users are still waiting for reliable and accurate solutions.

Humans recognize characters by recognizing the features associated with them. The human eye is trained to recognize these features and to associate them with the corresponding characters. As a result of this association, the human brain uses the feature information to