Paper information - Softconverter: A novel approach to construct OCR for printed Urdu isolated characters


Urdu accounts for a large part of the literature of the subcontinent (Pakistan, India, and Bangladesh), and this literature exists in hard (printed) form. Information in hard form is more difficult to change and distribute than information in soft form, which can be uploaded to the internet, edited, and reprinted. Books containing Urdu characters can be converted to soft form with Urdu OCR (Optical Character Recognition). Neural networks (NN) are used to construct OCR, but they make OCR development very difficult and complex, even when the font size and font style are fixed. In this paper we present a simple and easy way to construct an OCR for isolated characters of the Urdu language (or right-to-left writing), called “Softconverter”, with the help of a database and without using a neural network. This paper demonstrates that OCR can be implemented without using an NN. Our prototype of Softconverter has an accuracy rate of 97.43%.
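The abstract does not describe how the database is used, so the following is only a minimal sketch of one way a database-backed recognizer for isolated characters could work without a neural network: each segmented glyph is compared with stored templates and the closest one wins. The 3x3 bitmaps, the `TEMPLATES` table, the romanized labels, and the `hamming` and `recognize` helpers are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the paper's actual algorithm or database schema.
from typing import Dict, Tuple

# A glyph is a tuple of rows; each row is a string of '0'/'1' pixels.
Glyph = Tuple[str, ...]

# Toy "database" of 3x3 templates keyed by a romanized character name.
# The shapes are made up for the example and do not represent real Urdu glyphs.
TEMPLATES: Dict[str, Glyph] = {
    "alif": ("010", "010", "010"),
    "be": ("000", "111", "010"),
    "dal": ("110", "010", "110"),
}


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph: Glyph, templates: Dict[str, Glyph] = TEMPLATES) -> str:
    """Return the name of the stored template closest to the input glyph."""
    return min(templates, key=lambda name: hamming(glyph, templates[name]))


# An "alif" with one flipped pixel is still matched to the "alif" template.
noisy_alif: Glyph = ("010", "010", "011")
print(recognize(noisy_alif))
```

A nearest-template lookup of this kind depends on a fixed font size and style, which matches the constraint the abstract mentions; a real system would also need binarization and character segmentation before the lookup.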
