Classification of Hebrew calligraphic handwriting styles: preliminary results

We present preliminary results on the classification of ancient Hebrew manuscripts. The main goal is to analyze documents in different writing styles in order to identify where and when each test document was written, and by whom. This analysis depends crucially on good binarization of the degraded manuscripts, and we propose an accurate binarization method for them. We further propose and test topological features for handwriting style classification based on a selected subset of the Hebrew alphabet. In our preliminary experiments we used only two characters, Aleph and Lamed. Our results so far yield 100% correct classification on a set of fourteen documents written by fourteen different writers.
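
The abstract does not describe the proposed binarization method, so the sketch below is not that method. It shows, purely for orientation, what a binarization step looks like, using the well-known global threshold that maximizes between-class variance of the gray-level histogram (Otsu's criterion) as a stand-in baseline. The function names `otsu_threshold` and `binarize`, and the convention that dark pixels are ink, are assumptions made for this illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class and the rest form the other.
    This is a generic baseline, not the method proposed in the paper.
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight0 = 0        # number of pixels in the class at or below t
    sum0 = 0           # sum of gray levels in that class
    best_t, best_var = 0, -1.0

    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue   # no pixels at or below t yet
        weight1 = total - weight0
        if weight1 == 0:
            break      # every pixel is already at or below t
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance, up to a constant factor 1 / total**2.
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image, threshold):
    """Map a 2-D list of gray levels to 1 (ink, assumed dark) or 0 (background)."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


if __name__ == "__main__":
    page = [[10, 200, 200], [10, 10, 200]]      # toy 2x3 grayscale image
    flat = [p for row in page for p in row]
    t = otsu_threshold(flat)
    print(t, binarize(page, t))
```

A single global threshold of this kind tends to struggle on pages with stains or uneven fading, which is one reason binarization of old manuscripts is treated as a problem in its own right.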
