Title Automatic construction of online catalog topologies

Given a set of products, each characterized by a set of attribute values, an online catalog is an organization of product pages on the web through which users can reach the product information they need. A good online catalog is crucial to the success of an e-commerce web site. Traditionally, online catalogs have been built mainly by hand, and the extent to which this can be automated is a challenging problem. Recent work has investigated how to reorganize an existing online catalog according to some criteria, but none has addressed the problem of organizing an online catalog automatically from scratch. This paper attempts to tackle this problem. We model an online catalog organization as a decision tree structure and propose a metric, based on the popularity of products and the relative importance of product attribute values, to evaluate the quality of a catalog organization. The problem is then formulated as a decision tree construction problem. Although traditional decision tree algorithms, such as C4.5, can be used to generate an online catalog organization, the resulting catalog is generally not good according to our metric. We therefore develop an efficient greedy algorithm (GENCAT), and experimental results show that GENCAT produces better catalog organizations under our metric.
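
To make the decision tree view of a catalog concrete, the following is a minimal sketch and not the GENCAT algorithm or the metric proposed in the paper, neither of which is specified in this abstract. It assumes a hypothetical cost, the popularity-weighted number of clicks needed to reach a product, and ignores the relative importance of attribute values. The names `Node`, `build_catalog`, and `expected_depth`, as well as the toy products and popularity values, are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One catalog page: either a leaf or a page that splits on one attribute."""
    products: List[str]
    attribute: Optional[str] = None          # None for a leaf page
    children: Dict[str, "Node"] = field(default_factory=dict)


def build_catalog(products, attributes, order):
    """Build a catalog tree by splitting on attributes in the given order.

    Assumption: the split order is supplied by the caller; a real
    construction algorithm would choose it using its own quality metric.
    """
    names = list(products)
    if len(names) <= 1 or not order:
        return Node(names)
    attr, rest = order[0], order[1:]
    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(attributes[name][attr], []).append(name)
    node = Node(names, attr)
    for value, members in groups.items():
        node.children[value] = build_catalog(members, attributes, rest)
    return node


def expected_depth(node, popularity, depth=0):
    """Hypothetical cost: popularity-weighted clicks to reach each leaf page."""
    if not node.children:
        return sum(popularity[p] * depth for p in node.products)
    return sum(
        expected_depth(child, popularity, depth + 1)
        for child in node.children.values()
    )


# Toy data (illustrative only, not from the paper).
attributes = {
    "p1": {"brand": "A", "type": "laptop"},
    "p2": {"brand": "B", "type": "laptop"},
    "p3": {"brand": "B", "type": "phone"},
}
popularity = {"p1": 0.6, "p2": 0.3, "p3": 0.1}

brand_first = build_catalog(attributes, attributes, ["brand", "type"])
type_first = build_catalog(attributes, attributes, ["type", "brand"])
cost_brand_first = expected_depth(brand_first, popularity)
cost_type_first = expected_depth(type_first, popularity)
```

In this toy setting, splitting on `brand` first places the most popular product one click from the root, so its cost is lower than that of the `type`-first tree. This illustrates why a popularity-aware criterion may prefer a different tree than a purely class-purity criterion such as the one used by C4.5.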