Can citation analysis of Web publications better detect research fronts?

We present evidence that in some research fields, research published in journals and research reported on the Web may represent different evolutionary stages of the field, with journals lagging a few years behind the Web on average, and that a "two-tier" scholarly communication system may therefore be evolving. We conclude that in such fields, (a) author co-citation analyses (ACA) that use articles published on the Web as a data source can outperform traditional ACAs based on journal articles for detecting current research fronts, and (b) as a result, citation analysis studies of scholarly communication should draw on multiple data sources to obtain a complete picture of communication patterns. Our evidence comes from comparing the intellectual structures of the XML research field, a subfield of computer science, as revealed by three sets of ACA covering two time periods: (a) from the field's beginnings in 1996 to 2001, and (b) from 2001 to 2006. For the first time period, we analyze research articles both from journals as indexed by the Science Citation Index (SCI) and from the Web as indexed by CiteSeer. We follow up with an ACA of SCI data for the second time period. We find that most of the trends in the field's evolution that emerge when comparing SCI-based ACA results between the two time periods were already apparent in the CiteSeer-based ACA results for the first time period.
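As a rough illustration of the raw counts an ACA starts from, the following minimal sketch tallies, for each pair of authors, how many citing papers cite both. It is not the procedure used in this study: the function name `cocitation_counts`, the input layout (one list of cited authors per citing paper), and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each author pair, how many citing papers cite both authors.

    reference_lists: iterable of lists, one per citing paper, each holding
    the names of the authors that paper cites (hypothetical input layout).
    """
    counts = Counter()
    for cited_authors in reference_lists:
        # Deduplicate within a paper and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(cited_authors)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and the authors they cite.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author B", "Author C", "Author B"],  # a repeated author counts once per paper
]

counts = cocitation_counts(papers)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Running the same counting step separately on each data source and time period would give the co-citation matrices that the comparison described above relies on; any later normalization or multivariate analysis is outside the scope of this sketch.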
