LinkSelector: A Web mining approach to hyperlink selection for Web portals

As the size and complexity of Web sites expand dramatically, it has become increasingly challenging to design Web sites where Web surfers can easily find the information they seek. In this article, we address the design of the portal page of a Web site, which serves as the homepage of a Web site or a default Web portal. We define an important research problem, hyperlink selection: selecting, from a large set of hyperlinks in a given Web site, a limited number of hyperlinks for inclusion in a portal page. The objective of hyperlink selection is to maximize the efficiency, effectiveness, and usage of a Web site's portal page. We propose a heuristic approach to hyperlink selection, LinkSelector, which is based on two kinds of relationships among hyperlinks: structural relationships that can be extracted from an existing Web site and access relationships that can be discovered from a Web log. We compared the performance of LinkSelector with that of the current practice of hyperlink selection (i.e., manual hyperlink selection by domain experts), using data obtained from the University of Arizona Web site. Results showed that LinkSelector outperformed the current manual selection method.
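
To make the hyperlink selection problem concrete, the following is a minimal sketch of one possible greedy heuristic. It is not the LinkSelector algorithm, whose details the abstract does not give. It assumes that access relationships are available as per-session sets of visited pages taken from a Web log, and that structural relationships are available as a map from each page to the pages it links to. The function name `select_links`, its parameters, and the coverage-based score are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def select_links(sessions: List[Set[str]],
                 site_links: Dict[str, Set[str]],
                 k: int) -> List[str]:
    """Greedily pick up to k pages to link from a portal page.

    Illustrative heuristic only (an assumption, not the published method):
    a session counts as served when it contains a chosen page or a page
    that a chosen page links to directly.
    """
    # Every page seen in the site structure or in the log is a candidate.
    candidates = set(site_links) | {page for s in sessions for page in s}
    # Pages reachable from each candidate with at most one further click.
    reach = {c: {c} | site_links.get(c, set()) for c in candidates}

    uncovered = list(range(len(sessions)))  # indices of sessions not yet served
    chosen: List[str] = []

    while len(chosen) < k and candidates:
        def gain(c: str) -> int:
            # Number of still-unserved sessions this candidate would serve.
            return sum(1 for i in uncovered if sessions[i] & reach[c])

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=gain)
        if gain(best) == 0:
            break  # no remaining candidate serves any further session
        chosen.append(best)
        candidates.discard(best)
        uncovered = [i for i in uncovered if not (sessions[i] & reach[best])]

    return chosen
```

A call such as `select_links(sessions, site_links, k=10)` returns at most ten page identifiers in the order they were chosen. The greedy rule is used here only because it is short and easy to follow; a real system would likely weigh structural and access relationships in more detail.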

[1]  T. J. Watson,et al.  Data Mining for Path Traversal Patterns in a WebEnvironmentMing - , 1996 .

[2]  Yoav Shoham,et al.  Fab: content-based, collaborative recommendation , 1997, CACM.

[3]  Giles,et al.  Searching the world wide Web , 1998, Science.

[4]  C. Lee Giles,et al.  Accessibility of information on the web , 1999, Nature.

[5]  Jaideep Srivastava,et al.  Automatic personalization based on Web usage mining , 2000, CACM.

[6]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[7]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[8]  Myra Spiliopoulou,et al.  Data Mining for Measuring and Improving the Success of Web Sites , 2004, Data Mining and Knowledge Discovery.

[9]  James E. Pitkow,et al.  Characterizing Browsing Behaviors on the World-Wide Web , 1995 .

[10]  Gerard Salton,et al.  Automatic Information Organization And Retrieval , 1968 .

[11]  Henry Lieberman,et al.  Letizia: An Agent That Assists Web Browsing , 1995, IJCAI.

[12]  Udi Manber,et al.  Experience with personalization of Yahoo! , 2000, CACM.

[13]  P. Tan,et al.  WebSIFT : The Web Site Information Filter , 1999 .

[14]  Pattie Maes,et al.  Social information filtering: algorithms for automating “word of mouth” , 1995, CHI '95.

[15]  Jakob Nielsen,et al.  User interface directions for the Web , 1999, CACM.

[16]  Soumen Chakrabarti,et al.  Data mining for hypertext: a tutorial survey , 2000, SKDD.

[17]  John Riedl,et al.  GroupLens: an open architecture for collaborative filtering of netnews , 1994, CSCW '94.

[18]  Myra Spiliopoulou,et al.  Data Mining to Measure and Improve the Success of Web Sites , 2000, ArXiv.

[19]  Umeshwar Dayal,et al.  From User Access Patterns to Dynamic Hypertext Linking , 1996, Comput. Networks.

[20]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[21]  Dunja Mladenic,et al.  Machine learning used by Personal WebWatcher , 1999 .

[22]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[23]  James E. Pitkow,et al.  Characterizing Browsing Strategies in the World-Wide Web , 1995, Comput. Networks ISDN Syst..

[24]  Pedro M. Domingos,et al.  Adaptive Web Navigation for Wireless Devices , 2001, IJCAI.

[25]  Jaideep Srivastava,et al.  Web mining: information and pattern discovery on the World Wide Web , 1997, Proceedings Ninth IEEE International Conference on Tools with Artificial Intelligence.

[26]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[27]  Jon M. Kleinberg,et al.  Mining the Web's Link Structure , 1999, Computer.

[28]  Oren Etzioni,et al.  Towards adaptive Web sites: Conceptual framework and case study , 2000, Artif. Intell..

[29]  Jaideep Srivastava,et al.  Web usage mining: discovery and applications of usage patterns from Web data , 2000, SKDD.

[30]  Edith Schonberg,et al.  Visualization and Analysis of Clickstream Data of Online Stores for Understanding Web Merchandising , 2004, Data Mining and Knowledge Discovery.

[31]  Philip S. Yu,et al.  Data mining for path traversal patterns in a web environment , 1996, Proceedings of 16th International Conference on Distributed Computing Systems.

[32]  Thorsten Joachims,et al.  WebWatcher : A Learning Apprentice for the World Wide Web , 1995 .

[33]  Ken Lang,et al.  NewsWeeder: Learning to Filter Netnews , 1995, ICML.

[34]  Hendrik Blockeel,et al.  Web mining research: a survey , 2000, SKDD.

[35]  Yanhong Li Toward A Qualitative Search Engine , 1998, IEEE Internet Comput..

[36]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.