Flowchart recognition for non-textual information retrieval in patent search

Relatively little research has addressed patent image retrieval, and most existing approaches perform retrieval by computing a similarity measure between the query image and the images in the corpus. However, systems that bridge the semantic gap between the visual description of patent images and the concepts they convey would be very helpful for patent professionals. In this paper we present a flowchart recognition method that produces a structured representation of flowchart images, which can then be queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task, and we report the results obtained on that task's dataset.
