Data matching, integration, and interoperability for a metric assessment of monographs

This paper details a data experiment carried out at the University of Amsterdam, Center for Digital Humanities. Data pertaining to monographs were collected from three autonomous resources, the Scopus Journal Index, WorldCat.org and Goodreads, and linked according to unique identifiers in a new Microsoft SQL database. The purpose of the experiment was to investigate co-varied metrics for a list of book titles based on their citation impact (from Scopus), presence in international libraries (WorldCat.org) and visibility as publicly reviewed items (Goodreads). The results highlight current problems related to citation indices and to the inconsistent ways in which different citing authors record the same book. Our research further demonstrates the primary problem of matching book titles as ‘cited objects’ with book titles held in a union library catalog: libraries always record a book as a distinct item when it is published in separate editions with different International Standard Book Numbers (ISBNs). Because of these ISBN-related ‘matching’ problems, we suggest a new type of identifier, a ‘Book Object Identifier’ (BOI), which would allow bibliometricians to recognize a book published in multiple formats and editions as ‘one object’ suitable for evaluation. The BOI standard would be most useful for books published in the same language, and would more easily support the integration of data from different types of book indexes.
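The edition-matching problem described above can be illustrated with a minimal Python sketch. It is not the authors' method and not a definition of the proposed BOI: the functions `work_key` and `group_editions`, the normalized author-plus-title key, and the sample records are all hypothetical, chosen only to show how several ISBNs might be collapsed into one evaluable object.

```python
import re
import unicodedata
from collections import defaultdict


def work_key(title, first_author_surname):
    """Build a crude grouping key (hypothetical stand-in for a book-level identifier)."""
    def norm(text):
        # Strip accents, lowercase, and collapse every non-alphanumeric run to one space.
        text = unicodedata.normalize("NFKD", text)
        text = "".join(ch for ch in text if not unicodedata.combining(ch))
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return f"{norm(first_author_surname)}|{norm(title)}"


def group_editions(records):
    """Map each work key to the sorted list of ISBNs recorded for it."""
    groups = defaultdict(set)
    for rec in records:
        key = work_key(rec["title"], rec["author"])
        groups[key].add(rec["isbn"].replace("-", ""))
    return {key: sorted(isbns) for key, isbns in groups.items()}


# Made-up records for illustration only; these are not data from the study.
records = [
    {"title": "A History of Examples", "author": "Doe", "isbn": "978-0-00-000000-2"},
    {"title": "A history of examples.", "author": "DOE", "isbn": "978-0-00-000001-9"},
    {"title": "Another Book", "author": "Roe", "isbn": "978-0-00-000002-6"},
]
editions = group_editions(records)
```

In this sketch the first two records, which differ in capitalization, punctuation and ISBN, fall under a single key, so citation, library-holding and review counts attached to either ISBN could be summed for one book. A string key of this kind would break down for translated titles, which is consistent with the abstract's remark that such an identifier is most useful for books published in the same language.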
