Indicating Studies' Quality Based on Open Data in Digital Libraries

Researchers publish papers to report their results and thereby contribute to a steadily growing corpus of knowledge. To avoid unintentionally repeating research and studies, researchers need to be aware of the existing corpus. For this purpose, they crawl digital libraries and conduct systematic literature reviews to summarize existing knowledge. However, such approaches face several issues: not all documents are available to every researcher, relevant results may be missed due to ranking algorithms, and manually assessing the quality of a document requires time and effort. In this paper, we provide an overview of the publicly available information in different digital libraries in computer science. Based on these results, we derive a taxonomy that describes the connections among this information and discuss its suitability for quality assessments. Overall, we observe that bibliographic data and simple citation counts are available in almost all libraries, while some libraries provide information that few others offer. Some of this information may be used to improve automated quality assessment, but with limitations.
