Automatic construction of online catalog topologies

A good online catalog is crucial to the success of an e-commerce web site. Traditionally, online catalogs have been built mostly by hand, and the extent to which this process can be automated remains a challenging problem. Recent work has investigated how to reorganize an existing online catalog according to some criteria, but none of it addresses the problem of organizing an online catalog automatically from scratch. This paper attempts to tackle that problem. We model an online catalog organization as a decision tree and propose a metric, based on the popularity of products and the relative importance of product attribute values, to evaluate the quality of a catalog organization. The problem is then formulated as a decision tree construction problem. Although traditional decision tree algorithms, such as C4.5, can be used to generate an online catalog organization, the resulting catalog generally scores poorly on our metric. We therefore develop an efficient greedy algorithm (GENCAT), and experimental results show that GENCAT produces better catalog organizations according to our metric.
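
To make the decision-tree view of a catalog concrete, the following is a minimal sketch under stated assumptions, not the GENCAT algorithm or the paper's metric, neither of which is specified in this abstract. Each internal node splits the products on one attribute, each branch corresponds to one attribute value, and a shopper reaches a product by clicking down the tree. The `popularity` field, the split heuristic `expected_remaining`, and the score `expected_clicks` are hypothetical stand-ins: the score uses only product popularity and leaves out the relative importance of attribute values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    products: list                    # product dicts that reach this node
    attribute: Optional[str] = None   # attribute this node splits on (None for a leaf)
    children: dict = field(default_factory=dict)  # attribute value -> child Node


def partition(products, attribute):
    """Group products by their value for `attribute`."""
    groups = {}
    for p in products:
        groups.setdefault(p[attribute], []).append(p)
    return groups


def expected_remaining(products, attribute):
    """Hypothetical split heuristic: popularity-weighted number of products
    still listed after one click on `attribute`. Assumes positive popularity."""
    total = sum(p["popularity"] for p in products)
    return sum(
        sum(p["popularity"] for p in group) / total * len(group)
        for group in partition(products, attribute).values()
    )


def build(products, attributes, leaf_size=1):
    """Greedily grow a catalog tree, choosing at each node the attribute
    with the smallest expected_remaining value."""
    node = Node(products)
    if len(products) <= leaf_size or not attributes:
        return node
    best = min(attributes, key=lambda a: expected_remaining(products, a))
    groups = partition(products, best)
    if len(groups) == 1:
        # The best attribute does not separate anything, so no attribute does.
        return node
    node.attribute = best
    rest = [a for a in attributes if a != best]
    node.children = {v: build(g, rest, leaf_size) for v, g in groups.items()}
    return node


def expected_clicks(node, depth=0):
    """Hypothetical quality score: popularity-weighted sum of leaf depths.
    Lower is better; divide by total popularity for an average click count."""
    if not node.children:
        return depth * sum(p["popularity"] for p in node.products)
    return sum(expected_clicks(c, depth + 1) for c in node.children.values())
```

Calling `build(products, ["brand", "color"])` on a list of dicts that each carry `brand`, `color`, and `popularity` keys returns the root `Node`, and `expected_clicks` on that root gives the popularity-weighted click total. A popularity-aware heuristic of this kind tends to place popular products in small branches, which is one plausible reason a purely class-purity-driven learner such as C4.5 could score worse on a popularity-based metric.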
