Texture for script identification

Determining the script and language of a document image has a number of important applications in document analysis, such as indexing and sorting large collections of such images, or serving as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. A number of commonly used texture features are evaluated experimentally on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results when training data is limited and multiple font types are present are also proposed.
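To make the idea of "text as texture" concrete, here is a minimal sketch of one widely used family of texture features, gray-level co-occurrence statistics. The abstract does not say which features the paper evaluates, so this is an illustrative assumption rather than the authors' method; the function names `glcm`, `contrast`, and `energy` and the toy binary images are hypothetical.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, levels=2):
    """Normalized gray-level co-occurrence matrix for a single pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    Assumes the image is large enough to contain at least one pixel pair
    at the given offset.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def contrast(p):
    """Weights co-occurrences by squared gray-level difference."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared matrix entries; equals 1 for a perfectly uniform region."""
    return sum(v * v for row in p for v in row)


# Toy binary "textures": vertical strokes versus a blank region.
strokes = [[0, 1, 0, 1],
           [0, 1, 0, 1]]
blank = [[0, 0, 0, 0],
         [0, 0, 0, 0]]

stroke_features = (contrast(glcm(strokes)), energy(glcm(strokes)))
blank_features = (contrast(glcm(blank)), energy(glcm(blank)))
```

In a script identification setting, feature vectors of this kind would be computed over blocks of text and passed to a classifier; the choice of offsets, gray levels, and classifier is left open here.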
