A comparative study of different texture features for document image retrieval

Abstract: Due to the rapid growth in the number of digitised documents, document image retrieval has received significant attention over the past two decades. Finding discriminative and effective features is fundamental to building a fast and accurate retrieval system. Texture features are generally fast to compute and are suitable for large volumes of data. In this study, the effectiveness of texture features widely used in the content-based image retrieval literature is therefore investigated on document images. Twenty-six texture feature extraction methods from four main categories (statistical, transform-based, model-based, and structural approaches) are considered, and their performance on the problem of document image retrieval is compared. Three document image datasets, MTDB, ITESOFT, and CLEF_IP, with varied content and page layouts, are used to evaluate the twenty-six texture-based features in document image retrieval systems. The retrieval results are reported in terms of precision, recall, and F-score, and a comparative analysis of the results is provided. The feature dimensions and time complexity of the texture-based methods are also compared. Finally, conclusions are drawn and suggestions are made for future research directions.
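As an illustration of the evaluation measures named in the abstract, the following minimal sketch computes precision, recall, and F-score for a single query. It assumes the standard set-based definitions; the function name `retrieval_scores` and the document identifiers are hypothetical and are not taken from the study, whose exact evaluation protocol is not described here.

```python
def retrieval_scores(retrieved, relevant):
    """Return (precision, recall, F-score) for one query.

    retrieved: iterable of document ids returned by the system (assumed input).
    relevant:  iterable of document ids that are truly relevant (assumed input).
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    # Documents that are both returned and relevant.
    hits = len(retrieved_set & relevant_set)

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0

    # F-score as the harmonic mean of precision and recall.
    if precision + recall > 0:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return precision, recall, f_score


# Hypothetical query: four documents returned, three relevant overall.
p, r, f = retrieval_scores(["d1", "d2", "d3", "d4"], {"d1", "d3", "d5"})
print(f"precision={p:.3f} recall={r:.3f} F-score={f:.3f}")
```

In this toy query two of the four returned documents are relevant, so precision is 2/4 and recall is 2/3. Averaging these per-query scores over all queries is one common way to summarise a retrieval system, although the averaging scheme used in the study is not stated in the abstract.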
