Identifying Transformative Scientific Research

Transformative research is research that shifts or disrupts an established scientific paradigm. A notable example is the discovery of high-temperature superconductivity, which disrupted a theory that had been established 30 years earlier. Identifying potentially transformative research early and accurately matters to funding agencies that want to maximize the impact of their investments, and it helps scientists focus their attention on promising emerging work. This paper presents a data-driven approach that analyzes the citation patterns of scientific papers to quantify how much a potential challenger idea shifts an established paradigm. The key idea is that transformative research creates an observable disruption in the structure of "information cascades," the chains of references that can be traced back to the papers establishing a scientific paradigm. Such a disruption is visible soon after the challenger's introduction. We define a disruption score to quantify this disruption and develop an algorithm to compute the score from a large citation network. Experimental results show that our approach successfully identifies transformative scientific papers that disrupt established paradigms in Physics and Computer Science, regardless of whether the challenger paradigm is an instant hit or a classic whose contribution is formally recognized with a Nobel Prize only decades later.
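The notion of an information cascade can be made concrete with a small sketch. The abstract does not give the formula for the disruption score, so the code below does not reproduce it. It only illustrates, under stated assumptions, how a cascade can be traced through a citation network and how one might measure the share of a paradigm's cascade that also descends from a challenger paper. The function names `cascade` and `toy_capture_share`, the `cited_by` adjacency format, and the toy graph are all hypothetical.

```python
from collections import deque


def cascade(cited_by, seeds):
    """Return every paper that cites the seed papers directly or transitively.

    `cited_by` maps a paper id to the ids of the papers that cite it
    (an assumed input format, not one taken from the paper).
    """
    seeds = set(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        paper = queue.popleft()
        for citer in cited_by.get(paper, ()):
            if citer not in seen:
                seen.add(citer)
                queue.append(citer)
    return seen - seeds  # the seeds themselves are not part of their cascade


def toy_capture_share(cited_by, paradigm_seeds, challenger):
    """Fraction of the paradigm's cascade that also descends from the challenger.

    This is an illustrative proxy only; it is NOT the disruption score
    defined in the paper.
    """
    paradigm_cascade = cascade(cited_by, paradigm_seeds)
    if not paradigm_cascade:
        return 0.0
    challenger_cascade = cascade(cited_by, [challenger])
    return len(paradigm_cascade & challenger_cascade) / len(paradigm_cascade)


# Hypothetical toy network: paper "P" establishes a paradigm, "C" is a challenger.
toy_cited_by = {
    "P": ["A", "B", "C"],  # A, B and C cite P
    "C": ["D", "E"],       # D and E cite the challenger C
    "A": ["F"],            # F cites A
}

paradigm_cascade = cascade(toy_cited_by, ["P"])
share = toy_capture_share(toy_cited_by, ["P"], "C")
print(sorted(paradigm_cascade), share)
```

A breadth-first traversal is used here because it visits each paper once, which keeps the cost linear in the number of citation links. A real analysis would also need publication dates to restrict the comparison to papers that appeared after the challenger.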
