Standardizing, Segmenting and Tenderizing Letters and Improving the Quality of Envelope Images to Extract Postal Addresses

In most mechanized postal systems, envelopes are scanned according to the postal standard using mechanical instruments. In the standard format, the envelope image has no tilt, text lines run along the horizontal axis, and words are upright rather than oblique. This article presents a new algorithm for rotating, segmenting and tenderizing letters in order to standardize an envelope image and improve its quality; its stages can be used in any text identification system as three pre-processing algorithms. In the proposed algorithm, letters scanned with any form or tilt are rotated and standardized by a simple two-step algorithm that relies on what is written on the envelope and does not require calculating the tilt angle. After standardization, the main regions of the image are identified using histogram information. Then a simple algorithm selects candidate points from the pixels belonging to the text on the envelope, and quality improvement and tenderization are applied to the main regions of the image. The advantages of the proposed algorithm include no need for additional mechanical equipment, less calculation, simplicity, and consideration of the structure of the words on the envelope in all preprocessing phases. DOI: http://dx.doi.org/10.11591/ijece.v2i3.342
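
The abstract says only that the main regions are found from histogram information and gives no further detail. One common way to do this is a horizontal projection profile: count the text pixels in each row of a binarized image and treat runs of non-empty rows as text bands. The sketch below illustrates that general idea only and is not the authors' implementation; the function names `row_histogram` and `find_text_bands`, the `min_pixels` threshold, and the toy image are all assumptions made for illustration.

```python
from typing import List, Tuple


def row_histogram(image: List[List[int]]) -> List[int]:
    """Count text pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def find_text_bands(hist: List[int], min_pixels: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, where hist >= min_pixels.

    min_pixels is a hypothetical noise threshold, not a value from the paper.
    """
    bands: List[Tuple[int, int]] = []
    start = None
    for i, count in enumerate(hist):
        if count >= min_pixels and start is None:
            start = i                      # a text band begins
        elif count < min_pixels and start is not None:
            bands.append((start, i))       # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(hist)))
    return bands


# Toy 6x5 binary "envelope" with two text lines (assumed data).
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
bands = find_text_bands(row_histogram(image))
print(bands)  # [(1, 3), (4, 5)]
```

Applying the same function to column sums within each band would split a band into word or block regions. This only works once the image has been de-skewed, which is why the abstract places standardization before the histogram step.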
