WRITER IDENTIFICATION OF ARABIC TEXT USING STATISTICAL AND STRUCTURAL FEATURES

This article addresses writer identification of handwritten Arabic text. Several types of structural and statistical features were extracted from handwritten Arabic text, including structural features obtained with a novel approach that builds on some of the main characteristics of the Arabic language. The extracted features comprise connected component features, gradient distribution features, windowed gradient distribution features, contour chain code distribution features, and windowed contour chain code distribution features. A nearest neighbor (NN) classifier was used with the Euclidean distance measure. Data reduction algorithms were applied, viz. principal component analysis (PCA), linear discriminant analysis (LDA), multiple discriminant analysis (MDA), multidimensional scaling (MDS), and forward/backward feature selection. The database consists of 500 paragraphs handwritten in Arabic by 250 writers; the paragraphs were randomly generated from a large corpus. For the first 100 writers, NN provided the best accuracy in text-independent writer identification, with a top-1 result of 88.0%, a top-5 result of 96.0%, and a top-10 result of 98.5%. When the work was extended to all 250 writers with the backward feature selection algorithm (using 54 of the 83 features), the system attained a top-1 result of 75.0%, a top-5 result of 91.8%, and a top-10 result of 95.4%.
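
The nearest neighbor ranking and the top-k figures reported above can be illustrated with a minimal sketch. It assumes each handwriting sample has already been reduced to a fixed-length feature vector; the function names (`rank_writers`, `top_k_accuracy`), the writer labels, and the two-dimensional toy vectors are hypothetical and are not taken from the article, which uses up to 83 features.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_writers(query, references):
    """Rank writer ids by the distance of their closest reference sample.

    references: list of (writer_id, feature_vector) pairs (toy data here).
    Each writer appears once in the returned list, nearest first.
    """
    best = {}
    for writer, vec in references:
        d = euclidean(query, vec)
        if writer not in best or d < best[writer]:
            best[writer] = d
    return sorted(best, key=lambda w: best[w])

def top_k_accuracy(queries, references, k):
    """Fraction of queries whose true writer is among the k nearest writers."""
    hits = sum(
        1 for true_writer, vec in queries
        if true_writer in rank_writers(vec, references)[:k]
    )
    return hits / len(queries)

# Toy reference and query sets (illustrative values only).
references = [("w1", [0.0, 0.0]), ("w2", [1.0, 1.0]), ("w3", [4.0, 4.0])]
queries = [("w1", [0.1, 0.2]), ("w2", [0.4, 0.4]), ("w3", [3.0, 3.5])]

top1 = top_k_accuracy(queries, references, 1)
top2 = top_k_accuracy(queries, references, 2)
```

In this toy setup the second query lies closer to a sample of `w1` than to its true writer `w2`, so it counts as a miss at top-1 but as a hit at top-2. This is the same reason the reported top-5 and top-10 results are higher than the top-1 result.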
