A Chinese Character Segmentation Algorithm for Complicated Printed Documents

Character segmentation for printed documents plays an important role in optical character recognition, ticket information identification, postal code identification, automatic license plate recognition, and similar applications. This paper proposes a Chinese character segmentation algorithm for complicated printed documents, intended for use in a paper watermarking system. In this application, the algorithm aims to achieve high-accuracy Chinese character segmentation and highly consistent segmentation between the digital image and the print-scanned image of the same document. The method consists of three main steps: connected region recognition, connected region merging, and fine-grained segmentation. Experiments demonstrate the effectiveness of the proposed algorithm.
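As a rough illustration of the first two steps, the following minimal sketch labels connected ink regions in a binarized text line and then merges neighbouring regions into character candidates. It is not the paper's implementation: the function names, the 8-connectivity choice, and the `max_gap` merging rule are all assumptions made for this example, and the fine-grained segmentation step is omitted because the abstract gives no detail on it.

```python
from collections import deque


def connected_regions(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink regions.

    `img` is a list of rows; a value of 1 marks an ink pixel, 0 marks background.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_regions(boxes, max_gap=1):
    """Merge boxes separated by at most `max_gap` blank columns.

    The gap rule is a hypothetical stand-in for the paper's merging criteria.
    """
    merged = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left-to-right order
        if merged and box[1] - merged[-1][3] - 1 <= max_gap:
            t, l, b, r = merged[-1]
            merged[-1] = (min(t, box[0]), l, max(b, box[2]), max(r, box[3]))
        else:
            merged.append(box)
    return merged
```

Merging matters for Chinese text because many characters consist of several disconnected strokes or components, so a single character often yields more than one connected region. A real merging rule would likely also use character width and line height estimates rather than a fixed gap.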
