Figure Retrieval in Biomedical Literature

Automatic classification of figures in healthcare documents is known to be useful for biomedical document mining. The context of a document is directly reflected in the figures it contains. Text embedded in these figures, together with image features, has been used for figure retrieval. We demonstrate that image features based on the structural properties of figures are sufficient on their own for the figure retrieval task. We present an algorithm for describing the structural properties of embedded images, the Fourier Edge Orientation Autocorrelogram, which uses the spatial distribution of detected edges. We show that the Fourier Edge Orientation Autocorrelogram performs better than its predecessor when most of the edge information is retained. The algorithm is validated on publicly available figures from healthcare literature. In addition to being invariant to scale, rotation and non-uniform illumination, the proposed feature descriptor is shown to be relatively robust to noisy edges. Since no standard dataset is available for figure classification, the proposed feature descriptor is compared with four well-known binary shape descriptors. The retrieval performance shows an overall improvement over other known methods in the figure retrieval task.
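
The abstract does not spell out the descriptor, so the following is only a minimal sketch of the general idea, not the authors' implementation. It builds a plain edge orientation autocorrelogram (counts of edge-pixel pairs that share an orientation bin at a few fixed distances) and then takes the Fourier magnitude along the orientation axis, which is unchanged by a circular shift of the bins. The function names, the 36 bins, the distance set, the gradient threshold and the use of the Fourier magnitude for rotation invariance are all assumptions made for illustration.

```python
import numpy as np


def _shifted_pair(arr, dy, dx):
    """Return aligned views of arr at (y, x) and at (y + dy, x + dx)."""
    h, w = arr.shape
    ys, ye = max(0, -dy), h - max(0, dy)
    xs, xe = max(0, -dx), w - max(0, dx)
    return arr[ys:ye, xs:xe], arr[ys + dy:ye + dy, xs + dx:xe + dx]


def edge_orientation_autocorrelogram(img, n_bins=36, distances=(1, 3, 5, 7),
                                     mag_thresh=0.1):
    """Illustrative correlogram: rows are orientation bins, columns distances.

    All parameter defaults are assumptions, not values taken from the paper.
    """
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)              # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    corr = np.zeros((n_bins, len(distances)))
    peak = mag.max()
    if peak == 0:
        return corr                        # flat image: no edges at all

    # Keep only strong gradients as edge pixels (simple relative threshold).
    edges = mag > mag_thresh * peak
    theta = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    bins = np.minimum((theta / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    bins = np.where(edges, bins, -1)       # -1 marks non-edge pixels

    h, w = bins.shape
    for k, d in enumerate(distances):
        if d >= min(h, w):
            continue                       # offset does not fit in the image
        # Horizontal, vertical and both diagonal neighbours at distance d.
        for dy, dx in ((0, d), (d, 0), (d, d), (d, -d)):
            a, b = _shifted_pair(bins, dy, dx)
            match = (a >= 0) & (a == b)    # both are edges in the same bin
            corr[:, k] += np.bincount(a[match], minlength=n_bins)

    total = corr.sum()
    return corr / total if total > 0 else corr


def fourier_descriptor(corr):
    """Fourier magnitude along the orientation axis, flattened to a vector."""
    return np.abs(np.fft.rfft(corr, axis=0)).ravel()
```

Two figures could then be compared by a distance between their `fourier_descriptor` vectors. Dividing by the total count removes the dependence on the number of edge pixels, and discarding the Fourier phase removes the dependence on where the orientation bins start. A rotated figure matches only to the extent that rotation acts as a shift of those bins.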
