A new mixed binarization method used in a real time application of automatic business document and postal mail sorting

Binarization is applied in the first stage of the segmentation process and has a very strong impact on the performance of a system that automatically sorts business documents and mail. We begin this paper with a complete study of the existing binarization mechanisms, which were developed to meet the needs of specific applications. These conventional approaches have weaknesses that must be overcome, and they remain unsuitable for our real-time application. Separating the thresholding stage from the text-zone location stage considerably increases the computation time and leads to an oversegmentation of noise and paper texture in empty zones of the image. Indeed, none of the traditional methods, whether global or local, efficiently meets all the required conditions. We optimize this stage by applying a local threshold only near the text zones, which are located by the cumulated gradients method combined with multiresolution and mathematical morphology. We demonstrate the consistent performance of the proposed method on several types of business documents and mail with wide-ranging content and image quality.
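
The general idea of thresholding only near text zones can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`text_mask`, `binarize`), the block size, the gradient fraction, the contrast check, and the midpoint threshold are all assumptions chosen for illustration. Gradient magnitudes are cumulated over coarse blocks (a single low-resolution level), the resulting block map is dilated with a 3x3 structuring element, and a simple local threshold is applied only inside the selected blocks, so empty paper is never thresholded.

```python
import numpy as np

def text_mask(gray, block=8, grad_frac=0.2):
    """Coarse text-zone mask from gradients cumulated per block (illustrative only)."""
    g = gray.astype(float)
    gy, gx = np.gradient(g)                  # derivatives along rows, then columns
    mag = np.abs(gx) + np.abs(gy)
    h, w = g.shape
    hb, wb = h // block, w // block
    # Cumulate gradient magnitude over block x block cells (one coarse resolution level).
    acc = mag[:hb * block, :wb * block].reshape(hb, block, wb, block).sum(axis=(1, 3))
    coarse = acc > grad_frac * acc.max()     # assumed heuristic: keep high-activity cells
    # 3x3 binary dilation on the coarse grid so zones include their surroundings.
    padded = np.pad(coarse, 1)
    dil = np.zeros_like(coarse)
    for dy in range(3):
        for dx in range(3):
            dil |= padded[dy:dy + hb, dx:dx + wb]
    # Bring the coarse mask back to full resolution.
    mask = np.zeros((h, w), dtype=bool)
    mask[:hb * block, :wb * block] = np.repeat(np.repeat(dil, block, axis=0), block, axis=1)
    return mask

def binarize(gray, block=8, grad_frac=0.2, min_contrast=30):
    """Threshold locally, but only inside blocks flagged as text zones. True = ink."""
    mask = text_mask(gray, block, grad_frac)
    out = np.zeros(gray.shape, dtype=bool)
    h, w = gray.shape
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            if not mask[y, x]:
                continue                     # empty zone: never thresholded
            cell = gray[y:y + block, x:x + block].astype(float)
            if cell.max() - cell.min() < min_contrast:
                continue                     # flat cell: paper texture only
            t = (cell.min() + cell.max()) / 2  # assumed simple local threshold
            out[y:y + block, x:x + block] = cell < t
    return out

# Usage on a synthetic page: light, slightly textured paper with one dark stroke.
yy, xx = np.mgrid[0:64, 0:64]
img = (200 + (xx + yy) % 3).astype(np.uint8)
img[20:28, 16:48] = 30
mask = text_mask(img)
ink = binarize(img)
```

Because the local threshold is evaluated only inside the mask, the paper texture in the rest of the image cannot be oversegmented, and the per-block work is skipped for empty zones. A production system would likely use several resolution levels and a more robust local threshold than the midpoint used here.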
