The beauty of brimstone butterfly: novelty of patents identified by near environment analysis based on text mining

The novelty of a patent may be seen as those patterns that distinguish it from other patents and scientific literature. Understanding it serves many purposes, both in scientometric research and in the management of technological information. While many methods deal with a patent's meta-information, such as citation networks or co-classification analysis, the analysis of novelty in the full text of a patent is still in the early stages of research and, in practice, a time-consuming manual task. The question we pose is whether computer-based text mining methods can identify those elements of a patent that make it novel from a technological and an application/market perspective. For this purpose, we introduce and operationalize the concept of near environment analysis and apply a three-step text mining approach to one of the patents nominated as a finalist in the 2012 European Inventor Award contest. We demonstrate that such an approach can single out the novelty of the patent, in terms of content, within its near environment. The method can also be used for other patents and, with adaptation of the near environment analysis, for scientific literature.
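To make the idea of singling out novelty within a near environment more concrete, the following is a minimal sketch and not the three-step approach itself, whose details are not given here. It assumes, purely for illustration, that novelty candidates are terms that occur often in the focal patent but in few documents of its near environment, and it scores them with a plain TF-IDF weighting. The function names `tokenize` and `novelty_terms` and the toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def novelty_terms(focal_text, environment_texts, top_n=5):
    """Rank the focal document's terms by TF-IDF against its near environment.

    Assumption (not taken from the paper): a term is a novelty candidate
    when it is frequent in the focal patent but appears in few
    near-environment documents.
    """
    focal_counts = Counter(tokenize(focal_text))
    env_token_sets = [set(tokenize(t)) for t in environment_texts]
    n_docs = len(env_token_sets) + 1  # environment plus the focal document

    scores = {}
    for term, tf in focal_counts.items():
        # Document frequency includes the focal document itself, so df >= 1.
        df = 1 + sum(term in tokens for tokens in env_token_sets)
        scores[term] = tf * math.log(n_docs / df)

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Toy texts, invented for illustration only.
focal = "scaly surface structure reflects light; scaly coating on solar cell"
environment = [
    "coating on solar cell improves light absorption",
    "surface structure of a solar cell with coating",
]
top_terms = novelty_terms(focal, environment, top_n=3)
print(top_terms)
```

In this toy setting, terms shared by every document in the near environment receive a score of zero, while terms unique to the focal text rise to the top. A realistic pipeline would additionally need stop-word removal, stemming, and a principled way of selecting the near environment.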
