Indexation of Syriac manuscripts using directional features

This paper presents a method to assist the indexation of digitized Syriac manuscripts. Syriac belongs to the Aramaic branch of the Semitic languages; it is written from right to left, with the script intentionally tilted by an angle of approximately 45°. The proposed method follows a word-spotting approach that aims to locate all occurrences of a given query word image. It uses a selective sliding-window technique, extracting directional features from each window. Features are matched using the Euclidean distance. The method requires no prior information and is fully independent of any word-to-character segmentation algorithm, which would be extremely difficult to implement because of the tilted nature of the handwriting.
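The pipeline described above (selective sliding window, directional features, Euclidean matching) can be illustrated with a minimal sketch. The abstract does not specify the feature definition or the window selection rule, so everything below is an assumption made for illustration: the directional feature is a magnitude-weighted histogram of gradient orientations, the selection rule simply skips windows with too little ink, and the names `directional_features`, `spot_word`, `step`, and `min_ink` are hypothetical. Images are assumed to be binarized with ink = 1 and background = 0. The 45° tilt of the script is not modelled here.

```python
import numpy as np


def directional_features(patch, n_bins=8):
    """Assumed feature: orientation histogram weighted by gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    # Fold orientations into [0, pi) so opposite gradient directions share a bin.
    angle = np.mod(np.arctan2(gy, gx), np.pi)
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, np.pi),
                           weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def spot_word(page, query, step=4, min_ink=0.05):
    """Slide a query-sized window over the page and rank windows by distance.

    Returns a list of (distance, top, left) tuples, best match first.
    """
    qh, qw = query.shape
    q_feat = directional_features(query)
    hits = []
    for top in range(0, page.shape[0] - qh + 1, step):
        for left in range(0, page.shape[1] - qw + 1, step):
            window = page[top:top + qh, left:left + qw]
            # Assumed selection rule: skip windows that are almost empty.
            if window.mean() < min_ink:
                continue
            # Euclidean distance between window and query feature vectors.
            dist = np.linalg.norm(directional_features(window) - q_feat)
            hits.append((dist, top, left))
    return sorted(hits)
```

Calling `spot_word(page, query)` returns candidate positions ordered by increasing distance; in practice a threshold on the distance, or a fixed number of top-ranked windows, would decide which candidates are reported as occurrences. A single global histogram per window is a deliberate simplification and discards the spatial layout of strokes within the word.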
