Improved Arabic handwriting word segmentation approach using Random Forests

This work proposes an approach for segmenting Arabic handwritten words. Words are first over-segmented, and the candidate segmentation points (SPs) are then validated. Because the accuracy of the validation stage determines the accuracy of the whole system, an improved validation approach is proposed to alleviate the limitations of other approaches and enhance accuracy. In this validation approach, a set of zoning features is extracted and used to train an efficient Random Forests (RF) ensemble of classifiers. These features are chosen for their strength in capturing both local and global characteristics of handwritten characters. The proposed approach is tested on 500 words from the standard IFN/ENIT database, and its accuracy is compared against a recent, efficient approach that uses modified directional features (MDF) and a neural network classifier. The results demonstrate the accuracy of the proposed approach and its ability to alleviate the limitations found in previous techniques.
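As an illustration of what a zoning feature can look like, the following minimal sketch divides a binarized image window into a grid of zones and records the fraction of ink pixels in each zone. The abstract does not specify the exact zoning scheme, so the grid size, the density measure, and the function name `zoning_features` are assumptions made for illustration, not the method evaluated in this work.

```python
from typing import List


def zoning_features(image: List[List[int]], rows: int = 3, cols: int = 3) -> List[float]:
    """Return the ink density of each zone, in row-major order.

    `image` is a binarized window (1 = ink, 0 = background), for example
    a patch around a candidate segmentation point. The rows x cols grid
    and the density measure are illustrative assumptions.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height == 0 or width == 0:
        raise ValueError("image must be non-empty")

    features = []
    for r in range(rows):
        # Integer boundaries so the zones tile the image without gaps.
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            area = (bottom - top) * (right - left)
            if area == 0:
                # Zone is empty when the grid is finer than the image.
                features.append(0.0)
                continue
            ink = sum(image[y][x]
                      for y in range(top, bottom)
                      for x in range(left, right))
            features.append(ink / area)
    return features


# Hypothetical 4x4 window split into a 2x2 grid.
sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(zoning_features(sample, rows=2, cols=2))  # [1.0, 0.0, 0.0, 0.25]
```

Each zone value reflects local stroke content, while the full vector describes the overall layout of the window. A vector of this kind, computed per candidate SP and paired with a valid/invalid label, could then be passed to an RF implementation such as scikit-learn's `RandomForestClassifier`.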