Scene Character Reconstruction through Medial Axis

Reconstructing the shape of scene characters is challenging because such characters usually suffer from uneven illumination, complex backgrounds, and perspective distortion. To address these adverse conditions, we propose using Histogram Gradient Division (HGD) and Reverse Gradient Orientation (RGO) to select Candidate Text Pixels (CTPs) for a given input character. A Ring Radius Transform is then applied to each pixel of the CTP image to obtain a radius map, in which each pixel is assigned the radius to its nearest CTP. Candidate medial axis pixels are those with the maximum radius value in their neighborhood. We search for such pixels along the horizontal, vertical, principal diagonal, and secondary diagonal directions to determine the medial axis pixels for each direction, and take the union of these directional results as the candidate medial axis pixels of the character. Color difference and k-means clustering are then employed to eliminate false candidate medial axis pixels. The values of the remaining potential medial axis pixels are used to reconstruct the shape of the character. The method is tested on 1025 characters with complex foregrounds and backgrounds from the ICDAR 2003 dataset. Experimental results demonstrate its effectiveness on such characters in terms of character recognition rate and reconstruction error.
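The radius map and the four-direction search for candidate medial axis pixels can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `radius_map` and `candidate_medial_axis` are hypothetical, the radius is assumed to be the Euclidean distance to the nearest CTP, and the "neighborhood" is assumed to be the two immediate neighbors along each scan direction. CTP selection (HGD and RGO) and the later color-difference and k-means filtering are not covered.

```python
import math

# Offsets (row step, column step) for the four scan directions in the abstract.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "principal_diagonal": (1, 1),
    "secondary_diagonal": (1, -1),
}


def radius_map(ctp):
    """Brute-force radius map: distance from each pixel to the nearest CTP.

    `ctp` is a list of rows; truthy entries mark candidate text pixels.
    Euclidean distance is an assumption made for this sketch.
    """
    points = [(r, c) for r, row in enumerate(ctp) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("the CTP image contains no candidate text pixels")
    return [
        [min(math.hypot(r - pr, c - pc) for pr, pc in points)
         for c in range(len(ctp[0]))]
        for r in range(len(ctp))
    ]


def candidate_medial_axis(radius):
    """Union of pixels that are radius maxima along any of the four directions.

    A pixel counts as a maximum along a direction when both neighbors on that
    line lie inside the image, neither exceeds it, and at least one is
    strictly smaller (so flat plateaus are not reported).
    """
    rows, cols = len(radius), len(radius[0])
    candidates = set()
    for r in range(rows):
        for c in range(cols):
            v = radius[r][c]
            if v <= 0:  # CTPs themselves have radius 0 and are skipped
                continue
            for dr, dc in DIRECTIONS.values():
                r0, c0, r1, c1 = r - dr, c - dc, r + dr, c + dc
                if not (0 <= r0 < rows and 0 <= c0 < cols
                        and 0 <= r1 < rows and 0 <= c1 < cols):
                    continue  # direction leaves the image at this pixel
                a, b = radius[r0][c0], radius[r1][c1]
                if v >= a and v >= b and (v > a or v > b):
                    candidates.add((r, c))
                    break
    return candidates


# Toy CTP image: two vertical stroke edges at columns 1 and 5 of a 5x7 grid.
ctp = [[1 if c in (1, 5) else 0 for c in range(7)] for _ in range(5)]
radius = radius_map(ctp)
axis = candidate_medial_axis(radius)
```

For this toy input the candidates form the column midway between the two edges, which is the behavior expected of a medial axis for a straight stroke. On real characters the union also contains false candidates, which is why the method adds the subsequent elimination step.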
