Modeling journal bibliometrics to predict downloads and inform purchase decisions at university research libraries

University libraries provide access to thousands of online journals and other content, spending millions of dollars annually on these electronic resources. Despite this cost, it is difficult both to analyze the value of this content to the institution and to identify the journals that provide comparatively more value. In this research, we examine 1,510 journals from a large research university library, representing more than 40% of the university's annual subscription cost for electronic resources at the time of the study. We use a web analytics approach to build a linear regression model that predicts usage among these journals. We categorize metrics into two classes: global (journal focused) and local (institution dependent). Using 275 journals as our training set, our analysis shows that a combination of global and local metrics creates the strongest model for predicting full‐text downloads. Our linear regression model has an accuracy of more than 80% in predicting downloads for the 1,235 journals in our test set. The findings imply that university libraries that use local metrics gain better insight into the value of a journal and can therefore manage content costs more efficiently.
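
The workflow summarized above (fit a linear regression on 275 journals, then evaluate it on the remaining 1,235) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, the names `global_metric` and `local_metric` are hypothetical stand-ins for the two metric classes, and R² is assumed as the quality measure because the abstract does not define how accuracy was computed.

```python
import random


def fit_ols(rows, targets):
    """Fit ordinary least squares with an intercept via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Return intercept plus the weighted sum of the features."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def r_squared(coefs, rows, targets):
    """Coefficient of determination on the given data."""
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - predict(coefs, r)) ** 2 for r, y in zip(rows, targets))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): one global and one local metric per
# journal, with downloads generated from both plus noise.
rng = random.Random(0)
journals = []
for _ in range(1510):
    global_metric = rng.uniform(0.0, 10.0)  # hypothetical journal-level score
    local_metric = rng.uniform(0.0, 50.0)   # hypothetical institution-level count
    downloads = 20.0 + 15.0 * global_metric + 8.0 * local_metric + rng.gauss(0.0, 25.0)
    journals.append(((global_metric, local_metric), downloads))

# Same split sizes as the study: 275 for training, 1,235 for testing.
rng.shuffle(journals)
train, test = journals[:275], journals[275:]


def evaluate(feature_indexes):
    """Fit on the training set with the chosen features; score on the test set."""
    def select(data):
        return [[features[i] for i in feature_indexes] for features, _ in data]
    coefs = fit_ols(select(train), [y for _, y in train])
    return r_squared(coefs, select(test), [y for _, y in test])


r2_global_only = evaluate([0])
r2_local_only = evaluate([1])
r2_combined = evaluate([0, 1])
print(f"global only: {r2_global_only:.3f}")
print(f"local only:  {r2_local_only:.3f}")
print(f"combined:    {r2_combined:.3f}")
```

Because the synthetic downloads are generated from both metrics by construction, the combined model scoring highest here only demonstrates the comparison procedure; it says nothing about the study's actual data or results.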
