A New Frontier of Informetric and Webometric Research: Mining Web Usage Data

Many Webometric studies have been conducted since the early days of the Web. Most have focused on Web hyperlinks as the data source. Very few have used Web usage data, a rich data source that can be analyzed for various purposes. The objective of this paper is to encourage more research into this new frontier and to advance informetric and Webometric research with new Web data sources. I first introduce the concept of Web data mining and discuss how it relates to traditional informetric research. I then discuss the types of Web usage data that are available and provide examples of studies that have used them. Finally, I discuss the limitations of Web usage data.
