Same data—different results? Towards a comparative approach to the identification of thematic structures in science

Science studies are persistently challenged by the elusive structures of their subject matter, be it scientific knowledge or the various collectivities of researchers engaged in its production. Bibliometrics has responded by developing a strong and growing structural bibliometrics, which is concerned with delineating fields and identifying thematic structures. In the course of these developments, a concern has emerged and continues to grow. Do the sets of publications, authors or institutions we identify and visualise with our methods indeed represent thematic structures? To what extent are the results of topic identification exercises determined by properties of knowledge structures, and to what extent are they determined by the approaches we use? Do we produce more than artefacts? These questions triggered the collective process of comparative topic identification reported in this special issue. The introduction traces the history of bibliometric approaches to topic identification, identifies the major challenges involved in these exercises, and introduces the contributions to the special issue.
