Mining patterns of author orders in scientific publications

The author order of multi-authored papers can reveal subtle patterns of scientific collaboration and provide insights into how credit is assigned among coauthors. This article proposes a sequence-based perspective on scientific collaboration. Using frequently occurring sequences as the unit of analysis, this study explores (1) which types of sequence patterns are most common in scientific collaboration at the level of authors, institutions, U.S. states, and nations in Library and Information Science (LIS); and (2) the productivity (measured by number of papers) and influence (measured by citation counts) of different types of sequence patterns. Results show that (1) at all four levels of analysis, the productivity and influence of frequent sequences approximately follow a power law; (2) productivity and influence are significantly and positively correlated across frequent sequences, and the strength of the correlation increases with the level of integration; (3) for author-level, institution-level, and state-level frequent sequences, short geographical distances between the authors usually co-occur with high productivity, while long distances tend to co-occur with large citation counts; (4) for author-level frequent sequences, the pattern of “the more productive and prestigious authors ranking ahead” has the highest productivity and the highest influence; at the remaining levels of analysis, however, the pattern with the highest productivity and the highest influence is “the less productive and prestigious institutions/states/nations ranking ahead.”
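As a rough illustration of the sequence-based view, the sketch below counts order-preserving author sequences across a handful of papers and reports, for each frequent sequence, its productivity (number of papers) and influence (summed citation counts). It is a minimal sketch under stated assumptions, not the study's actual procedure: the function name `mine_frequent_sequences`, the `min_support` and `max_len` parameters, the choice to allow non-contiguous subsequences, and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def mine_frequent_sequences(papers, min_support=2, max_len=3):
    """Return {sequence: (productivity, influence)} for frequent sequences.

    papers: iterable of (ordered author list, citation count).
    A sequence is any order-preserving subsequence of a byline
    (an assumption; the study may define sequences differently).
    """
    productivity = defaultdict(int)  # number of papers containing the sequence
    influence = defaultdict(int)     # total citations of those papers

    for authors, citations in papers:
        seen = set()
        # combinations() keeps the input order, so each tuple respects
        # the order in which the authors appear on the byline.
        for k in range(2, min(max_len, len(authors)) + 1):
            seen.update(combinations(authors, k))
        # Count each sequence at most once per paper.
        for seq in seen:
            productivity[seq] += 1
            influence[seq] += citations

    return {
        seq: (productivity[seq], influence[seq])
        for seq in productivity
        if productivity[seq] >= min_support
    }


# Hypothetical toy data: (byline in order, citation count).
toy_papers = [
    (["A", "B", "C"], 10),
    (["A", "B"], 5),
    (["B", "A", "C"], 2),
]

frequent = mine_frequent_sequences(toy_papers, min_support=2)
for seq, (n_papers, n_citations) in sorted(frequent.items()):
    print(" > ".join(seq), n_papers, n_citations)
```

The same function could be applied at the institution, state, or nation level by replacing each author with the corresponding affiliation before counting. Exhaustive enumeration only suits short bylines; for larger data, a dedicated sequential pattern mining algorithm would be the more practical choice.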
