Incorporating Usage Information into Average-Clicks Algorithm

A number of methods exist for measuring the distance between two web pages. Average-Clicks is a new measure of distance between web pages that fits users' intuition of distance better than the traditional measure, the number of clicks between two pages. Average-Clicks, however, assumes that a user is equally likely to follow any link on a web page, and so gives equal weight to each outgoing link. Our method, Usage Aware Average-Clicks, takes users' browsing behavior into account and assigns different weights to the links on a page according to how frequently users follow each link. Usage Aware Average-Clicks is thus an extension of the Average-Clicks algorithm in which the static web link structure graph is combined with the dynamic usage graph (built from the information available in web logs) to assign different weights to the links on a web page and hence capture users' intuition of distance more accurately. A new distance metric has been designed using this methodology and used to improve the efficiency of a web recommendation engine.
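The idea can be illustrated with a minimal sketch. It is not the method's actual formulation but one way the weighting might look, under stated assumptions: each link gets a length of the form -log_n(alpha * p), where p is the probability of following that link; p is uniform over a page's out-links in the static case; and in the usage-aware case p is a linear blend of the uniform probability and the click frequency observed in the logs. The blend parameter `mix`, the defaults `base=7.0` and `damping=0.85`, and all function names are hypothetical choices made for illustration. The distance between two pages is then taken as the shortest path under these link lengths.

```python
import heapq
import math


def link_lengths(out_links, click_counts=None, mix=0.5, base=7.0, damping=0.85):
    """Return {(src, dst): length} with length = -log_base(damping * p).

    p is uniform over a page's out-links when no usage data is available;
    otherwise it is blended with observed click frequencies (an assumed,
    illustrative blend controlled by `mix`).
    """
    lengths = {}
    for src, dsts in out_links.items():
        if not dsts:
            continue
        uniform = 1.0 / len(dsts)
        counts = (click_counts or {}).get(src, {})
        total = sum(counts.get(dst, 0) for dst in dsts)
        for dst in dsts:
            if total > 0:
                usage = counts.get(dst, 0) / total
                p = (1.0 - mix) * uniform + mix * usage
            else:
                p = uniform
            if p <= 0.0:
                continue  # never-followed link with mix == 1: treat as unreachable
            lengths[(src, dst)] = -math.log(damping * p, base)
    return lengths


def distance(lengths, start, goal):
    """Shortest-path distance under the given link lengths (Dijkstra)."""
    adjacency = {}
    for (src, dst), weight in lengths.items():
        adjacency.setdefault(src, []).append((dst, weight))
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in adjacency.get(node, []):
            candidate = dist + weight
            if candidate < best.get(nxt, math.inf):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return math.inf


# Hypothetical toy site: page A links to B and C, both of which link to D.
out_links = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
# Hypothetical log summary: users on A follow the link to B far more often.
click_counts = {"A": {"B": 90, "C": 10}}

static_lengths = link_lengths(out_links)
usage_lengths = link_lengths(out_links, click_counts)
print(distance(static_lengths, "A", "D"), distance(usage_lengths, "A", "D"))
```

In this toy example the frequently followed link from A to B becomes shorter than under uniform weighting and the rarely followed link from A to C becomes longer, so pages reached through popular links end up closer to A.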
