On-line Script Recognition

Automatic identification of handwritten script facilitates many important applications, such as automatic transcription of multi-lingual documents and searching the Internet for documents containing a particular script. The growing use of handheld devices that accept handwritten input is creating a huge volume of handwritten data. We propose a method to classify words and lines in an online handwritten document into Arabic, Cyrillic, Devnagari, Han, Hebrew and Roman scripts. The proposed classification system, based on spatial and temporal features of the strokes, attained an overall classification accuracy of 86.5% at the word level on a dataset containing 13,379 words. The classification accuracy improves to 95% when the number of words in the test sample is increased to five, and to 95.1% for complete text lines.
