A topic model analysis of science and technology linkages: A case study in pharmaceutical industry

Science and technology (S&T) linkages have been studied extensively using patent and scientific publication databases. Existing methods used to track S&T linkages, such as analysis of non-patent literature (NPL) or author-inventor matching, offer only a narrow window for industry-level analysis of the data. This paper examines the application of a machine learning algorithm, namely Latent Dirichlet Allocation (LDA), to detect semantic relationships between patent and scientific publication corpora. The case of "Taxol", a cancer drug, is used to illustrate the performance of the unsupervised algorithm in clustering documents with similar topics. In total, 26,475 documents retrieved from the Europe PMC database were used as the sample for the analysis. Qualitative analysis of the clusters shows that topic clustering is a valuable approach for detecting linkages between patents and publications.
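To make the idea of topic-based linkage concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling on a toy corpus. It is not the authors' pipeline: the six documents, their labels, the hyperparameters (`alpha`, `beta`, `iterations`), and the helper names `fit_lda` and `dominant_topic_clusters` are all assumptions made for illustration. Each document is assigned to its most probable topic, and a cluster that contains both a patent and a publication is read as a candidate S&T linkage.

```python
import random
from collections import defaultdict

# Toy, invented corpus: (document type, tokens). Not data from the paper.
docs = [
    ("patent", "paclitaxel formulation injection dosage formulation".split()),
    ("publication", "paclitaxel dosage injection clinical formulation".split()),
    ("patent", "microtubule binding assay tubulin binding".split()),
    ("publication", "tubulin microtubule binding mechanism assay".split()),
    ("publication", "yew bark extraction yield extraction".split()),
    ("patent", "bark extraction yew process yield".split()),
]

def fit_lda(token_lists, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in token_lists for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in token_lists]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(token_lists):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(token_lists):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    # Smoothed topic proportions for each document.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(token_lists)
    ]

def dominant_topic_clusters(theta, labels):
    """Group document labels by each document's most probable topic."""
    clusters = defaultdict(list)
    for row, label in zip(theta, labels):
        clusters[row.index(max(row))].append(label)
    return dict(clusters)

theta = fit_lda([tokens for _, tokens in docs], num_topics=3)
clusters = dominant_topic_clusters(theta, [kind for kind, _ in docs])
for topic, members in sorted(clusters.items()):
    linked = "patent" in members and "publication" in members
    print(topic, members, "candidate linkage" if linked else "single-source cluster")
```

On a corpus of realistic size, a library implementation with proper preprocessing and model selection would be the more practical choice; the sampler above is kept in plain Python only so that every step of the assignment is visible.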
