A system towards Indian postal automation

In this paper, we present a system towards Indian postal automation. The proposed system first decomposes the image into blocks using the run-length smoothing algorithm (RLSA). Non-text blocks (postal stamps, postal seals, etc.) are then detected based on the black pixel density and the number of components inside each block. Using positional information, the destination address block (DAB) is identified among the text blocks. Next, the pin-code box is detected within the DAB and the numerals inside it are extracted. Since India is a multi-lingual and multi-script country, the address part may be written in a combination of two languages: Arabic and a local language. For sorting postal documents written in Arabic and the local language Bangla, a two-stage MLP-based classifier is employed to recognise Bangla and Arabic numerals. At present, the accuracy of the handwritten numeral recognition module is 92.10%.
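The block decomposition step can be illustrated with a minimal sketch of run-length smoothing. It assumes a binary image stored as a list of rows with 1 for black and 0 for white, fills only white runs that lie between two black pixels, and combines the horizontal and vertical passes with a logical AND, as in the commonly cited formulation of RLSA. The function names and the threshold parameters are hypothetical; the abstract does not state which variant or which threshold values the system uses.

```python
def smooth_row(row, threshold):
    """Fill white runs (0s) of length <= threshold lying between two black pixels (1s)."""
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1
                if 0 < gap <= threshold:
                    # The white run is short enough: merge it into the black region.
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth rows and columns separately, then AND the two results pixel by pixel."""
    horizontal = [smooth_row(r, h_threshold) for r in image]
    # Transpose so that columns can be smoothed with the same row routine.
    columns = [list(c) for c in zip(*image)]
    vertical_t = [smooth_row(c, v_threshold) for c in columns]
    vertical = [list(r) for r in zip(*vertical_t)]
    return [[h & v for h, v in zip(hr, vr)]
            for hr, vr in zip(horizontal, vertical)]


# Hypothetical thresholds chosen only for illustration.
print(smooth_row([1, 0, 0, 1, 0, 0, 0, 0, 1], 2))
```

In the example call, the first gap of two white pixels is filled while the gap of four is left open. Connected regions of the smoothed image can then be treated as candidate blocks for the density and component-count test described above.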
