Appropriate similarity measures for author co-citation analysis

This article offers several new insights into the methodological discussion about author co-citation analysis. We first argue that the Pearson correlation is not a very satisfactory measure of the similarity between authors' co-citation profiles. We then discuss which similarity measures may be used as alternatives, and we consider three in particular: the well-known cosine and two measures that have not been used before in the bibliometric literature. We show by means of an example that the choice of an appropriate similarity measure has high practical relevance. Finally, we discuss the use of similarity measures for statistical inference. © 2008 Wiley Periodicals, Inc.
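
To make the contrast between the Pearson correlation and the cosine concrete, the following is a minimal sketch in Python. The co-citation counts are invented for illustration and do not come from the article. The property shown is one commonly raised in this debate: the Pearson correlation is sensitive to appended zeros, while the cosine is not. It is not necessarily the argument the article itself makes. The function names `pearson` and `cosine` and the two profiles are assumptions of this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length count vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical co-citation profiles: counts of how often authors A and B
# are each co-cited with three other authors (made-up numbers).
profile_a = [4, 2, 1]
profile_b = [1, 2, 4]

# The same profiles after adding five authors who are never co-cited
# with either A or B (five extra zeros in each profile).
padded_a = profile_a + [0] * 5
padded_b = profile_b + [0] * 5

print(f"Pearson, original: {pearson(profile_a, profile_b):.3f}")
print(f"Pearson, padded:   {pearson(padded_a, padded_b):.3f}")
print(f"Cosine, original:  {cosine(profile_a, profile_b):.3f}")
print(f"Cosine, padded:    {cosine(padded_a, padded_b):.3f}")
```

With these made-up numbers the Pearson correlation moves from about -0.93 to about 0.39 once the zeros are appended, whereas the cosine stays at about 0.57 in both cases. Adding authors who are unrelated to both A and B therefore changes the Pearson-based similarity, and here even flips its sign, while leaving the cosine untouched.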
