Applying key phrase extraction to aid invalidity search

Invalidity search poses challenges that differ from those of conventional Information Retrieval problems. At present, its success depends on the queries that a patent examiner constructs from a patent application. Because constructing relevant queries takes considerable time, creating them automatically from a patent would save the examiner substantial effort. In this paper, we address the problem of automatically creating queries from an input patent. An effective query can be formed by extracting important keywords or phrases from the patent using Key Phrase Extraction (KPE) techniques. Several KPE algorithms have been proposed in the literature, but their performance on query construction for patents has not yet been explored. We systematically evaluate and analyze the performance of queries created with state-of-the-art KPE techniques on the invalidity search task. Our experiments show that queries formed by KPE approaches perform better than those formed by selecting phrases based on term frequency (tf) or tf-idf scores.
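
To make the tf-idf baseline mentioned above concrete, the following is a minimal sketch of forming a query by ranking candidate phrases from a patent by tf-idf. It is an illustration under stated assumptions, not the procedure used in the paper: the stopword list, the restriction to unigrams and bigrams, the smoothed idf formula, and the names `candidate_phrases` and `tfidf_query` are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is",
             "for", "by", "on", "with", "has"}


def candidate_phrases(text, max_len=2):
    """Return word n-grams (n <= max_len) that do not span a stopword."""
    phrases = []
    run = []  # current run of consecutive non-stopword tokens
    # The trailing None is a sentinel that flushes the final run.
    for token in re.findall(r"[a-z]+", text.lower()) + [None]:
        if token is None or token in STOPWORDS:
            for n in range(1, max_len + 1):
                for i in range(len(run) - n + 1):
                    phrases.append(" ".join(run[i:i + n]))
            run = []
        else:
            run.append(token)
    return phrases


def tfidf_query(patent_text, background_docs, k=5):
    """Rank the patent's candidate phrases by tf-idf and keep the top k.

    background_docs stands in for the collection used to estimate
    document frequencies (hypothetical; any patent corpus could be used).
    """
    tf = Counter(candidate_phrases(patent_text))
    doc_sets = [set(candidate_phrases(d)) for d in background_docs]
    n_docs = len(doc_sets)
    scores = {}
    for phrase, count in tf.items():
        df = sum(phrase in s for s in doc_sets)
        # Smoothed idf so phrases unseen in the background stay finite.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[phrase] = count * idf
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


if __name__ == "__main__":
    patent = ("A rotor blade for a wind turbine. "
              "The rotor blade has a heated edge.")
    background = ["A turbine for a ship",
                  "The blade of a knife has an edge"]
    print(tfidf_query(patent, background, k=3))
```

The returned phrases would then be joined into a query for a retrieval engine. A KPE method would replace only the scoring step, which is what makes a baseline of this kind a natural point of comparison.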
