Domain lexicon-based query expansion for patent retrieval

Patent retrieval is important for technology surveys and knowledge protection. Its aim is to retrieve as many patent documents relevant to a query patent as possible, which makes it a recall-oriented task. However, existing methods suffer from the term mismatch problem, which arises because patents frequently use non-standard technical terminology. To address this issue, we present a novel patent retrieval approach based on domain lexicon-based query expansion. In particular, we present a domain lexicon construction scheme and a domain lexicon-based query expansion algorithm that augments the query with suitable expansion concepts. Experimental results on the CLEF-IP patent data set demonstrate that the proposed approach achieves a significant improvement in retrieval performance over several typical methods.
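
As a rough illustration of the general idea of lexicon-based query expansion, the sketch below looks up each query term in a small lexicon and appends the highest-weighted related concepts to the query. It is not the algorithm of this paper, whose details the abstract does not give. The lexicon contents, the weights, and the names `TOY_LEXICON`, `expand_query`, `top_k`, and `min_weight` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon: term -> {related concept: association weight}.
# A real domain lexicon would be built from a patent corpus or other resources.
TOY_LEXICON = {
    "battery": {"accumulator": 0.9, "electrochemical cell": 0.8, "power source": 0.4},
    "electrode": {"anode": 0.7, "cathode": 0.7},
}


def expand_query(query_terms, lexicon, top_k=3, min_weight=0.5):
    """Append up to top_k lexicon concepts to the original query terms.

    Assumed scoring: a concept's score is the sum of its weights over all
    query terms that list it; concepts below min_weight are ignored.
    """
    original = set(query_terms)
    scores = defaultdict(float)
    for term in query_terms:
        for concept, weight in lexicon.get(term, {}).items():
            if concept not in original and weight >= min_weight:
                scores[concept] += weight
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [concept for concept, _ in ranked[:top_k]]


expanded = expand_query(["lithium", "battery", "electrode"], TOY_LEXICON)
print(expanded)
```

In this toy setting, a query that mentions "battery" can also match documents that say "accumulator", which is the kind of term mismatch the abstract describes. The cutoffs `top_k` and `min_weight` stand in for whatever concept selection a real system would use.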
