Will this Idea Spread Beyond Academia? Understanding Knowledge Transfer of Scientific Concepts across Text Corpora

What kinds of basic research ideas are more likely to be applied in practice? A long line of research investigates patterns of knowledge transfer, but it generally takes documents as the unit of analysis and follows their transfer into practice within a specific scientific domain. Here we study translational research at the level of scientific concepts across all scientific fields. We do this through text mining and predictive modeling on three corpora: 38.6 million paper abstracts, 4 million patent documents, and 0.28 million clinical trials. We extract scientific concepts (i.e., phrases) from these corpora as instantiations of "research ideas", create concept-level features motivated by the literature, and then follow the trajectories of over 450,000 new concepts (which emerged between 1995 and 2014) to identify the factors that lead only a small proportion of these ideas to be used in inventions and drug trials. Results from our analysis suggest several mechanisms that distinguish which scientific concepts will be adopted in practice and which will not. We also demonstrate that our derived features can be used to explain and predict knowledge transfer with high accuracy. Our work provides a greater understanding of knowledge transfer for researchers, practitioners, and government agencies interested in encouraging translational research.
