Optical Character Recognition System for Urdu (Naskh Font) Using Pattern Matching Technique

Offline optical character recognition (OCR) systems have been developed for many languages in recent years. The US Postal Service has used such systems to automate its services since 1965. The range of applications is growing steadily because OCR is useful in almost every major area of government and the private sector. The technique has helped many large organizations move toward a paper-free environment, particularly for backing up their existing file records. The system proposed here performs offline recognition of isolated characters of the Urdu language, since Urdu forms words by combining isolated characters. Urdu is a cursive language in which connected characters form words. The main use of an Urdu OCR will be digitizing the large body of literary material already held in libraries. Urdu is widely known and is spoken in more than three large countries, including Pakistan, India and Bangladesh. A great deal of Urdu poetry and literature has been produced up to recent times. An Urdu OCR will play an important role in moving this work from physical libraries to electronic libraries. Most of the material already placed on the internet is in the form of images of text, which take a lot of space to transfer and even to read online. An Urdu OCR is therefore needed. The system is of the training type. The training stage consists of image preprocessing, line and character segmentation, and creation of an XML file for training purposes. The recognition stage takes the XML file and the image to be recognized, segments the image, creates chain codes for the character images, and matches them against the chain codes already stored in the XML file.
The system has been implemented and has 89% recognition accuracy with a recognition rate of 15 characters per second.

Tabassam Nawaz, Syed Ammar Hassan Shah Naqvi, Habib ur Rehman & Anoshia Faiz. International Journal of Image Processing (IJIP), Volume 3, Issue 3, p. 93.
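The abstract names chain codes and matching against stored codes but does not give the details, so the following Python sketch is only an illustration under stated assumptions. It traces the boundary of one shape in a binary image as an 8-direction Freeman chain code, then picks the closest stored template. The function names, the template labels, the in-memory dictionary standing in for the XML file, and the similarity measure (difflib's sequence ratio) are all assumptions rather than the authors' method.

```python
from difflib import SequenceMatcher

# Freeman 8-direction codes as (row, col) steps: 0 = east, then counterclockwise.
STEPS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def chain_code(image):
    """Chain code of the boundary of the shape holding the first ink pixel.

    `image` is a list of rows of 0/1 values, where 1 marks ink.
    """
    rows, cols = len(image), len(image[0])

    def is_ink(r, c):
        return 0 <= r < rows and 0 <= c < cols and image[r][c] == 1

    # Start at the first ink pixel in raster order (top row first, left to right).
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if image[r][c] == 1), None)
    if start is None:
        return []  # blank image

    points, codes, direction = [start], [], 7
    while True:
        r, c = points[-1]
        # Begin the neighbour search just behind the previous move direction.
        first = (direction + 7) % 8 if direction % 2 == 0 else (direction + 6) % 8
        for k in range(8):
            d = (first + k) % 8
            nr, nc = r + STEPS[d][0], c + STEPS[d][1]
            if is_ink(nr, nc):
                break
        else:
            return []  # isolated pixel: no boundary moves to record
        points.append((nr, nc))
        codes.append(d)
        direction = d
        # Stop once the trace repeats its first move from the start pixel.
        if len(points) > 2 and points[-1] == points[1] and points[-2] == points[0]:
            return codes[:-1]


def best_match(codes, templates):
    """Return the label whose stored chain code is most similar to `codes`."""
    return max(templates,
               key=lambda label: SequenceMatcher(None, codes, templates[label]).ratio())


# Hypothetical templates standing in for the chain codes kept in the XML file.
templates = {"square": [6, 0, 2, 4], "bar": [0, 4]}
square = [[1, 1],
          [1, 1]]
print(chain_code(square))                         # → [6, 0, 2, 4]
print(best_match(chain_code(square), templates))  # → square
```

A chain code produced this way depends on the starting pixel and on the size of the character image, so a practical system would normally normalize the character images before comparing codes.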
