A hybrid similarity measure method for patent portfolio analysis

Similarity measures are fundamental tools for identifying relationships within or across patent portfolios. Many bibliometric indicators can serve as the basis for a similarity measure, for example bibliographic coupling, citation and co-citation, and co-word distribution. This paper constructs a hybrid similarity measure method based on multiple indicators to analyze patent portfolios. Two models are proposed: categorical similarity and semantic similarity. The categorical similarity model emphasizes international patent classifications (IPCs), while the semantic similarity model emphasizes textual elements. We introduce fuzzy set routines to translate the rough technical (sub-)categories of IPCs into defined numeric values, and we calculate the categorical similarities between patent portfolios using membership grade vectors. In parallel, we identify and highlight core terms in a 3-level tree structure and compute the semantic similarities by comparing the tree-based structures. A weighting model is designed to address 1) the bias that exists between the categorical and semantic similarities, and 2) the strategy for weighting and integrating the two into a hybrid measure. A case study measuring the technological similarities between selected firms in China’s medical device industry demonstrates the reliability of our method, and the results indicate its practical value in a broad range of informetric applications.
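The abstract does not give formulas, so the following is only a minimal sketch of how a hybrid measure of this kind could be combined, not the paper's actual method. It assumes cosine similarity between IPC membership grade vectors for the categorical part, uses a plain Jaccard overlap of core-term sets as a stand-in for the 3-level tree comparison, and integrates the two with a single hypothetical weight. All function names, the weight parameter, and the example values are illustrative assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no overlap can be measured against an empty vector
    return dot / (norm_u * norm_v)


def jaccard_similarity(terms_a, terms_b):
    """Share of distinct terms that two term collections have in common."""
    set_a, set_b = set(terms_a), set(terms_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def hybrid_similarity(categorical, semantic, weight=0.5):
    """Convex combination of the two scores (weight is a hypothetical parameter)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * categorical + (1.0 - weight) * semantic


# Hypothetical membership grades over three IPC (sub-)categories per firm.
firm_a_grades = [0.8, 0.2, 0.0]
firm_b_grades = [0.6, 0.4, 0.0]

# Hypothetical core terms extracted from each firm's patents.
firm_a_terms = {"catheter", "stent", "sensor"}
firm_b_terms = {"stent", "sensor", "implant"}

categorical = cosine_similarity(firm_a_grades, firm_b_grades)
semantic = jaccard_similarity(firm_a_terms, firm_b_terms)
print(hybrid_similarity(categorical, semantic, weight=0.5))
```

A convex combination keeps the hybrid score in the same [0, 1] range as its two inputs, which makes the weight easy to interpret; the weighting model described in the abstract also accounts for bias between the two similarities, which this sketch does not attempt.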
