A dataset of publication records for Nobel laureates

A central question in the science of science concerns how to develop a quantitative understanding of the evolution and impact of individual careers. Over the course of history, a relatively small fraction of individuals have made disproportionate, profound, and lasting impacts on science and society. Despite a long-standing interest in the careers of scientific elites across diverse disciplines, it remains difficult to collect large-scale career histories that could serve as training sets for systematic empirical and theoretical studies. Here, by combining unstructured data collected from CVs, university websites, and Wikipedia with the publication and citation database from Microsoft Academic Graph (MAG), we reconstructed the publication histories of nearly all Nobel Prize winners from the past century, using both manual curation and algorithmic disambiguation procedures. Data validation shows that the resulting dataset is among the most comprehensive collections of publication records for Nobel laureates currently available. As our quantitative understanding of science deepens, this dataset is expected to have increasing value. It will not only allow us to quantitatively probe novel patterns of productivity, collaboration, and impact governing successful scientific careers, but may also help us unearth the fundamental principles underlying creativity and the genesis of scientific breakthroughs.

Design Type(s): data integration objective • source-based data analysis objective • metadata search and retrieval objective
Measurement Type(s): publication
Technology Type(s): digital curation
Factor Type(s): Knowledge Field • temporal_instant
Sample Characteristic(s)
Machine-accessible metadata file describing the reported data (ISA-Tab format)
