Enhancing technology clustering through heuristics by using patent counts

The International Patent Classification (IPC) system is a hierarchical structure used primarily to classify and search patents according to the technical fields they concern. The number of patents classified under a given IPC code can therefore serve as an indicator of technical development in the relevant area, and can also form a basis for investigating the state of the art in a particular field of technology. This paper proposes an approach for clustering the technologies listed in the IPC by their patent counts. A set of n real numbers representing the patent counts of different technologies is partitioned into k clusters such that the sum of squared deviations from the cluster mean, taken over all clusters, is minimized. Because complete enumeration would take considerable solution time, two heuristics are considered for the clustering. The first is proposed specifically for this study; the second is the Great Deluge Algorithm (GDA), which has been used extensively for solving complicated problems. The heuristics are coded in Visual Basic (VB) 6.0, and a user interface is developed for the program. The program also attempts to find an appropriate value of k in order to obtain the best possible clustering. As an application of the proposed approach, patent data retrieved from the website of the Turkish Patent Institute (TPI) are used to cluster technologies.
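The clustering objective described above can be made concrete with a short sketch. The Python code below is not the authors' VB 6.0 program and does not reproduce their first heuristic; it illustrates the within-cluster sum of squared deviations and one common formulation of a Great Deluge-style search, in which a move is accepted if it improves the current cost or stays below a steadily falling "water level". The function names, the linear lowering rate, the single-item move, the rule that no cluster may become empty, and the sample counts are all assumptions made for illustration.

```python
import random


def within_cluster_sse(values, labels, k):
    """Sum over clusters of squared deviations from the cluster mean."""
    total = 0.0
    for c in range(k):
        members = [v for v, lab in zip(values, labels) if lab == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((v - mean) ** 2 for v in members)
    return total


def great_deluge_cluster(values, k, iterations=5000, seed=0):
    """Great Deluge-style search over cluster assignments (illustrative)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    rng = random.Random(seed)
    n = len(values)
    labels = [i % k for i in range(n)]      # arbitrary start, no empty cluster
    sizes = [labels.count(c) for c in range(k)]
    cost = within_cluster_sse(values, labels, k)
    best_labels, best_cost = labels[:], cost
    level = cost                            # initial "water level"
    step = cost / iterations                # assumed linear lowering rate
    for _ in range(iterations):
        i = rng.randrange(n)
        old, new = labels[i], rng.randrange(k)
        if new != old and sizes[old] > 1:   # never empty a cluster
            labels[i] = new
            candidate = within_cluster_sse(values, labels, k)
            if candidate <= cost or candidate <= level:
                cost = candidate
                sizes[old] -= 1
                sizes[new] += 1
                if cost < best_cost:
                    best_labels, best_cost = labels[:], cost
            else:
                labels[i] = old             # reject and undo the move
        level -= step                       # water level drops each iteration
    return best_labels, best_cost


# Hypothetical patent counts, invented purely for illustration.
counts = [3, 5, 4, 120, 130, 125, 40, 42]
labels, sse = great_deluge_cluster(counts, k=3)
print(labels, sse)
```

Recomputing the full objective for every candidate move keeps the sketch short; for larger n, an incremental update of the two affected cluster sums would be the natural refinement. Being a heuristic, the search returns a good partition rather than a guaranteed optimum.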
