A Study of Sindhi Related and Arabic Script Adapted Languages Recognition

A large number of publications are available on Optical Character Recognition (OCR). Significant research and many articles exist for the Latin, Chinese, and Japanese scripts. Arabic script is also one of the more mature scripts from an OCR perspective. The adapted languages that share the Arabic script or its extended characters, however, still lack OCR systems of their own. In this paper we present the efforts of researchers on Arabic and its related and adapted languages. The survey is organized into sections: the introduction is followed by the properties of the Sindhi language, and the OCR process techniques and methods used by various researchers are then presented. The last section is dedicated to future work and the conclusion.
