Self-citation is the hallmark of productive authors, of any gender

It was recently reported that men self-cite more than 50% more often than women across a wide variety of disciplines in the bibliographic database JSTOR. Here, we replicate this finding in a sample of 1.6 million papers from Author-ity, a version of PubMed with computationally disambiguated author names. More importantly, we show that the gender effect largely disappears when accounting for prior publication count in a multidimensional statistical model. Gender has the weakest effect on the probability of self-citation among an extensive set of features tested, including byline position, affiliation, ethnicity, collaboration size, time lag, subject-matter novelty, reference/citation counts, publication type, language, and venue. We find that self-citation is the hallmark of productive authors, of any gender, who cite their novel journal publications early and in similar venues, and who more often cross citation barriers such as language and indexing. As a result, papers by authors with short, disrupted, or diverse careers miss out on the initial boost in visibility gained from self-citations. Our data further suggest that this disproportionately affects women because of attrition and not because of disciplinary under-specialization.
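The central claim, that a crude gender gap can largely vanish once prior publication count is accounted for, can be illustrated with a small stratification sketch. This is not the paper's model, which is multidimensional and fitted to 1.6 million papers. The counts below are invented purely for illustration, and the names `TOY_COUNTS`, `crude_rates`, and `stratified_rates` are hypothetical.

```python
# Toy illustration only: every count here is invented, NOT taken from the paper.
# Each row: (gender, prior-publication stratum, papers, papers with a self-citation)
TOY_COUNTS = [
    ("F", "low", 800, 80),
    ("F", "high", 200, 100),
    ("M", "low", 400, 40),
    ("M", "high", 600, 300),
]


def crude_rates(rows):
    """Self-citation rate per gender, ignoring prior publication count."""
    totals = {}
    for gender, _stratum, papers, self_cited in rows:
        acc = totals.setdefault(gender, [0, 0])
        acc[0] += papers
        acc[1] += self_cited
    return {g: cited / papers for g, (papers, cited) in totals.items()}


def stratified_rates(rows):
    """Self-citation rate per (gender, stratum) cell."""
    return {(g, s): cited / papers for g, s, papers, cited in rows}


crude = crude_rates(TOY_COUNTS)
strat = stratified_rates(TOY_COUNTS)

print("crude:", crude)
for key in sorted(strat):
    print(key, strat[key])
```

In this toy table the crude rates differ between genders only because the two groups are distributed differently across the productivity strata; within each stratum the rates are identical. A regression with prior publication count as a covariate generalizes the same idea to many features at once.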
