Automatic categorization of figures in scientific documents

Figures are very important non-textual information contained in scientific documents. Current digital libraries do not provide users tools to retrieve documents based on the information available within the figures. We propose architecture for retrieving documents by integrating figures and other information. The initial step in enabling integrated document search is to categorize figures into a set of pre-defined types. We propose several categories of figures based on their functionalities in scholarly articles. We have developed a machine-learning-based approach for automatic categorization of figures. Both global features, such as texture, and part features, such as lines, are utilized in the architecture for discriminating among figure categories. The proposed approach has been evaluated on a testbed document set collected from the CiteSeer scientific literature digital library. Experimental evaluation has demonstrated that our algorithms can produce acceptable results for real- world use. Our tools can be integrated into a scientific-document digital library

[1]  I. Biederman Recognition-by-components: a theory of human image understanding. , 1987, Psychological review.

[2]  James Ze Wang,et al.  Content-based image retrieval: approaches and trends of the new age , 2005, MIR '05.

[3]  C. Lee Giles,et al.  CiteSeer: an automatic citation indexing system , 1998, DL '98.

[4]  David S. Doermann,et al.  The Indexing and Retrieval of Document Images: A Survey , 1998, Comput. Vis. Image Underst..

[5]  Byung-Won On,et al.  Comparative study of name disambiguation problem using a scalable blocking-based framework , 2005, Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '05).

[6]  James Ze Wang,et al.  SST: an algorithm for finding near-exact sequence matches in time proportional to the logarithm of the database size , 2002, Bioinform..

[7]  Edward Lank,et al.  Treatment of Diagrams in Document Image Analysis , 2000, Diagrams.

[8]  Brian D. Davison,et al.  Robust document image understanding technologies , 2004, HDP '04.

[9]  Richard O. Duda,et al.  Use of the Hough transformation to detect lines and curves in pictures , 1972, CACM.

[10]  Jelena Kovacevic,et al.  Wavelets and Subband Coding , 2013, Prentice Hall Signal Processing Series.

[11]  Sudeep Sarkar,et al.  Comparison of Edge Detectors: A Methodology and Initial Study , 1998, Comput. Vis. Image Underst..

[12]  Thorsten Joachims,et al.  Making large-scale support vector machine learning practical , 1999 .

[13]  Emanuele Trucco,et al.  Introductory techniques for 3-D computer vision , 1998 .

[14]  Sargur N. Srihari,et al.  Knowledge-based derivation of document logical structure , 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.

[15]  Azriel Rosenfeld,et al.  Document structure analysis algorithms: a literature survey , 2003, IS&T/SPIE Electronic Imaging.

[16]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[17]  Hsin-Min Wang,et al.  On the extraction of vocal-related information to facilitate the management of popular music collections , 2005, Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '05).

[18]  Michael G. Christel,et al.  Addressing the challenge of visual information access from digital image and video libraries , 2005, Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '05).

[19]  Mor Naaman,et al.  Leveraging context to resolve identity in photo albums , 2005, Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '05).

[20]  James Ze Wang,et al.  SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  D. Ruppert The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .

[22]  Edward A. Fox,et al.  Automatic document metadata extraction using support vector machines , 2003, 2003 Joint Conference on Digital Libraries, 2003. Proceedings..

[23]  James Ze Wang,et al.  Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[24]  Qinghua Zheng,et al.  Automatic extraction of titles from general documents using machine learning , 2006, Inf. Process. Manag..

[25]  James Ze Wang,et al.  A computationally efficient approach to the estimation of two- and three-dimensional hidden Markov models , 2006, IEEE Transactions on Image Processing.

[26]  Sudeep Sarkar,et al.  Comparison of edge detectors: a methodology and initial study , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[27]  Hui Han,et al.  Name disambiguation in author citations using a K-way spectral clustering method , 2005, Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '05).

[28]  Yixin Chen,et al.  A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[29]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  David S. Doermann,et al.  A parallel-line detection algorithm based on HMM decoding , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.