Concept-based patent image retrieval

Abstract: Recently, the intellectual property and information retrieval communities have shown increasing interest in patent image retrieval, which could enhance current patent search practices. In this context, this article presents an approach for automatically extracting concept information that describes patent image content, in order to support searchers during patent retrieval tasks. The proposed approach is based on a supervised machine learning framework that relies on image and text analysis techniques. Specifically, we extract textual and visual low-level features from patent images and train detectors capable of identifying global concepts in patent figures. To evaluate this approach, we selected a dataset from the footwear domain and trained the concept detectors with different feature combinations. The experimental results show that combining the textual and visual information of patent images yields the best performance, outperforming both visual-only and textual-only features. This outcome provides initial evidence that concept detection can be applied to patent image retrieval and could be integrated into existing real-world applications to support patent searching.
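The idea of combining textual and visual features before training a concept detector can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-dimensional feature vectors are toy values, the function names (`fuse`, `train_detector`, `detect`) are hypothetical, and a nearest-centroid rule stands in for whatever classifier the authors actually trained, which the abstract does not specify. The sketch shows only early fusion by concatenation followed by a binary per-concept decision.

```python
from math import dist


def fuse(textual, visual):
    """Early fusion: concatenate textual and visual feature vectors."""
    return list(textual) + list(visual)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_detector(positives, negatives):
    """Stand-in detector: one centroid per class (not the paper's classifier)."""
    return centroid(positives), centroid(negatives)


def detect(detector, features):
    """Return True if the features lie closer to the positive centroid."""
    positive_centroid, negative_centroid = detector
    return dist(features, positive_centroid) < dist(features, negative_centroid)


# Toy training figures for one hypothetical concept (values are made up).
positives = [
    fuse([1.0, 0.0], [0.8, 0.2]),
    fuse([0.8, 0.2], [0.9, 0.1]),
]
negatives = [
    fuse([0.0, 1.0], [0.2, 0.8]),
    fuse([0.2, 0.8], [0.1, 0.9]),
]

detector = train_detector(positives, negatives)

# Classify two unseen toy figures.
print(detect(detector, fuse([0.9, 0.1], [0.7, 0.3])))
print(detect(detector, fuse([0.1, 0.9], [0.3, 0.7])))
```

Training the same detector on `textual` or `visual` vectors alone, instead of on the output of `fuse`, would give the single-modality variants that the abstract compares against the combined features.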
