Patent Clustering and Outlier Ranking Methodologies for Attributed Patent Citation Networks for Technology Opportunity Discovery

Ranking patents by outlierness in a patent citation network is a crucial task in patent analysis, particularly for technology opportunity discovery (TOD). Previous TOD studies focus on patent textual data. In this paper, we introduce a new approach that addresses TOD via patent outlierness, leveraging both patent attributes and citations. We propose that a patent outlier has the following characteristics: 1) it is not highly clustered with other patents; 2) it has low node centrality within the citation network; and 3) it has low similarity to other patents in the network. Existing outlier ranking approaches do not leverage the unique characteristics of attributed patent citation networks. We therefore propose new outlier ranking methods developed specifically for patents in such networks. Attribute data describe each patent independently, while citation network data relate patents to each other, so the two sources capture patent outlierness from different aspects. Given an attributed patent citation network, the contributions of this paper are: 1) a patent clustering algorithm, and 2) a method for scoring and ranking patents by outlierness. The developed methods are validated using artificial datasets, and the proposed outlier ranking methods are evaluated using U.S. patents in the area of digital information and security.
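The three outlier characteristics listed in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the scoring formula, so the code below assumes simple stand-ins (local clustering coefficient for clustering, degree centrality for node centrality, mean cosine similarity of attribute vectors for similarity) and an unweighted average of the three. The function names, the toy patents P1 to P5, and their attribute values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two attribute vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def outlier_scores(attrs, edges):
    """Score each patent in [0, 1]; higher means more outlying.

    Assumed proxies (not taken from the paper):
      clustering -> local clustering coefficient
      centrality -> degree centrality
      similarity -> mean cosine similarity to all other patents
    Citations are treated as undirected edges, and attribute values
    are assumed non-negative so that similarity stays in [0, 1].
    """
    nodes = list(attrs)
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = {}
    for n in nodes:
        k = len(neighbours[n])
        centrality = k / (len(nodes) - 1)
        # Count citation links among the neighbours of n.
        links = sum(1 for a, b in combinations(neighbours[n], 2)
                    if b in neighbours[a])
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        similarity = sum(cosine(attrs[n], attrs[m])
                         for m in nodes if m != n) / (len(nodes) - 1)
        # Low clustering, low centrality and low similarity all raise the score.
        scores[n] = 1 - (clustering + centrality + similarity) / 3
    return scores


def rank_outliers(attrs, edges):
    """Return patent ids ordered from most to least outlying."""
    scores = outlier_scores(attrs, edges)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy network: P1-P4 cite each other and have similar
# attributes; P5 has a single citation link and a different attribute profile.
toy_attrs = {
    "P1": [1.0, 0.0, 0.0],
    "P2": [1.0, 0.1, 0.0],
    "P3": [0.9, 0.0, 0.1],
    "P4": [1.0, 0.2, 0.0],
    "P5": [0.0, 0.0, 1.0],
}
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
             ("P3", "P4"), ("P2", "P4"), ("P4", "P5")]

if __name__ == "__main__":
    print(rank_outliers(toy_attrs, toy_edges))
```

In this toy network the sparsely cited, dissimilar patent P5 ranks first. Equal weighting of the three components is only one possible choice; a real study would need to justify the proxies and weights against labeled or synthetic data.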
