A bibliometric study of Video Retrieval Evaluation Benchmarking (TRECVid): A methodological analysis

This paper provides a discussion and analysis of methodological issues encountered during a scholarly impact and bibliometric study within the field of Computer Science, of TRECVid, the TREC (Text REtrieval Conference) Video Retrieval Evaluation. The purpose of this paper is to reflect on and analyse the methods used, in order to offer useful information and guidance to those who may wish to undertake similar studies; it is of particular relevance for academic disciplines whose publication and citation norms are not well served by traditional tools. Scopus and Google Scholar are discussed, and a detailed comparison is provided of the effects of different search and cleaning methods, within and between these tools, for subject and author analysis. The additional database coverage and usefulness of ‘Scopus More’, beyond ‘Scopus General’, are also discussed and evaluated. Scopus’s paper coverage is found to compare favourably with Google Scholar’s, but Scholar consistently performs better at finding citations to those papers. These additional citations significantly increase citation totals and also change the relative ranking of papers. Publish or Perish, a software wrapper for Google Scholar, is also examined; its limitations, and some possible solutions, are described. Data-cleaning methods, including duplicate checks, expert domain checking of bibliographic data, and content checking of retrieved papers, are compared, and their relative effects on paper and citation counts are discussed. Google Scholar and Scopus are also compared as tools for collecting bibliographic data for visualizations of developing trends and, owing to the comparative ease of collecting abstracts, Scopus is found to be far more effective.
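
To make the duplicate-check step concrete, the following is a minimal sketch, not the study's actual pipeline, of how near-duplicate bibliographic records exported from two sources might be flagged by normalising titles before comparison. The record fields, the example titles' exact formatting, and the 0.9 similarity threshold are illustrative assumptions.

```python
import re
from difflib import SequenceMatcher

def normalise_title(title: str) -> str:
    """Lower-case, strip punctuation, and collapse whitespace."""
    title = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return re.sub(r"\s+", " ", title).strip()

def likely_duplicates(records, threshold=0.9):
    """Return pairs of records whose normalised titles are near-identical."""
    pairs = []
    for i, a in enumerate(records):
        for b in records[i + 1:]:
            ratio = SequenceMatcher(
                None, normalise_title(a["title"]), normalise_title(b["title"])
            ).ratio()
            if ratio >= threshold:
                pairs.append((a, b, ratio))
    return pairs

# Hypothetical example: the same paper as exported by two different tools,
# differing in case, punctuation, and dash style.
records = [
    {"title": "The scholarly impact of TRECVid (2003-2009)", "source": "Scopus"},
    {"title": "The Scholarly Impact of TRECVid, 2003–2009.", "source": "Google Scholar"},
]
for a, b, ratio in likely_duplicates(records):
    print(f"possible duplicate ({ratio:.2f}): {a['source']} vs {b['source']}")
```

The pairwise comparison is quadratic in the number of records, which is acceptable at the scale of a single benchmark's publication list; a larger study would typically block on, say, publication year before comparing.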
