Benchmark Datasets for Offline Handwritten Gurmukhi Script Recognition

Handwritten character recognition is an important problem in pattern recognition and machine learning research. In recent years, several techniques for handwritten character recognition have been proposed. Because publicly accessible benchmark datasets for Gurmukhi script are lacking, these techniques have not been compared extensively on this script. Datasets and benchmarks have long proven fundamental to character recognition research and to objective comparison in many fields. This paper presents a collection of seven benchmark datasets of different sizes (HWR-Gurmukhi_1.1, HWR-Gurmukhi_1.2, HWR-Gurmukhi_1.3, HWR-Gurmukhi_2.1, HWR-Gurmukhi_2.2, HWR-Gurmukhi_2.3, and HWR-Gurmukhi_3.1) for offline handwritten Gurmukhi character recognition, collected from various public places. Exploratory results based on precision, False Acceptance Rate (FAR), and False Rejection Rate (FRR), obtained with several classification techniques, namely k-NN, RBF-SVM, MLP, Neural Network, Decision Tree, and Random Forest, are also presented.
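
To make the three evaluation measures concrete, the following is a minimal sketch of how precision, FAR, and FRR could be computed per class from a multi-class confusion matrix. It assumes a one-vs-rest reading of the measures (FAR as the false positive rate, FRR as the false negative rate), which may differ from the exact definitions used in the experiments. The function name `per_class_metrics` and the small 3-class matrix are hypothetical and are not taken from the HWR-Gurmukhi datasets.

```python
import numpy as np


def per_class_metrics(confusion):
    """Per-class precision, FAR, and FRR from a confusion matrix.

    Assumption: rows are true classes, columns are predicted classes,
    and each class is scored one-vs-rest.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)               # correctly recognised samples
    fp = confusion.sum(axis=0) - tp       # other classes accepted as this class
    fn = confusion.sum(axis=1) - tp       # this class rejected (assigned elsewhere)
    tn = confusion.sum() - tp - fp - fn   # everything else

    def safe_div(num, den):
        # Return 0 where the denominator is 0 instead of raising a warning.
        return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    precision = safe_div(tp, tp + fp)
    far = safe_div(fp, fp + tn)           # False Acceptance Rate
    frr = safe_div(fn, fn + tp)           # False Rejection Rate
    return precision, far, frr


# Hypothetical 3-class confusion matrix, for illustration only.
example = [[8, 1, 1],
           [2, 7, 1],
           [0, 0, 10]]
precision, far, frr = per_class_metrics(example)
```

Under these assumptions, the first class of the illustrative matrix has a precision of 0.8, a FAR of 0.1, and an FRR of 0.2. Averaging the per-class values gives a single figure for each classifier.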