Prior-Art Relevance Ranking Based on the Examiner's Query Log Content

This work belongs to the domain of technical information retrieval (IR) and, more specifically, patent retrieval. We show that the recorded history of patent examiners' search queries can be used to build a more effective method of finding prior-art patents than search methods based on titles and claims. We verify the performance of the proposed method experimentally: in our experiments it almost doubles recall compared with classical techniques based on titles and claims. A second contribution of our work is a database of over half a million patent examiners' queries, recorded from their search activity during the patent prosecution process. The paper also discusses the limitations of the current work and the ongoing research to further improve the proposed approach.
