The Diversity–Innovation Paradox in Science

Significance

By analyzing data from nearly all US PhD recipients and their dissertations across three decades, this paper finds that demographically underrepresented students innovate at higher rates than majority students, but their novel contributions are discounted and less likely to earn them academic positions. The discounting of minorities’ innovations may partly explain their underrepresentation in influential positions of academia.

Abstract

Prior work finds a diversity paradox: Diversity breeds innovation, yet underrepresented groups that diversify organizations have less successful careers within them. Does the diversity paradox hold for scientists as well? We study this using a near-complete population of ∼1.2 million US doctoral recipients from 1977 to 2015, following their careers into publishing and faculty positions. We use text analysis and machine learning to answer a series of questions: How do we detect scientific innovations? Are underrepresented groups more likely to generate scientific innovations? And are the innovations of underrepresented groups adopted and rewarded? Our analyses show that underrepresented groups produce higher rates of scientific novelty. However, their novel contributions are devalued and discounted: For example, novel contributions by gender and racial minorities are taken up by other scholars at lower rates than novel contributions by gender and racial majorities, and equally impactful contributions of gender and racial minorities are less likely to result in successful scientific careers than those of majority groups. These results suggest there may be unwarranted reproduction of stratification in academic careers that discounts diversity’s role in innovation and partly explains the underrepresentation of some groups in academia.
