We introduce the Constrained Subtree Selection (CSS) problem as a model for the optimal design of websites. Given a hierarchy of topics represented as a DAG G and a probability distribution over the topics, we select a subtree of the transitive closure of G that minimizes the expected path cost. We define the path cost as the sum of the page costs along a path from the root to a leaf. The page cost, γ, is a function of the number of links on a page. We give a sufficient condition on γ under which CSS is NP-complete. This result holds even for the uniform probability distribution. We give a polynomial-time algorithm for instances of CSS where G does not constrain the choice of subtrees and γ favors pages with at most k links. We show that CSS remains NP-hard for constant-degree DAGs, but we also provide an O(log(k)γ(d+1))-approximation for any G with maximum degree d, provided that γ favors pages with at most k links. Finally, we give a complete characterization of the optimal trees for two special cases: (1) linear degree cost in unconstrained graphs with uniform probability distributions, and (2) logarithmic degree cost in arbitrary DAGs with uniform probability distributions.
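To make the objective concrete, the following is a minimal sketch of how the expected path cost of one candidate tree could be evaluated. It is an illustration only, not the algorithm of the paper. The function name `expected_path_cost`, the dictionary-based tree encoding, and the toy data are all hypothetical, and the sketch assumes that only pages with outgoing links contribute a page cost and that the probability mass sits on the leaves.

```python
from typing import Callable, Dict, Hashable, List


def expected_path_cost(
    children: Dict[Hashable, List[Hashable]],
    prob: Dict[Hashable, float],
    gamma: Callable[[int], float],
    root: Hashable,
) -> float:
    """Return the sum over leaves of prob[leaf] * (cost of the root-to-leaf path).

    Assumption: a page's cost is gamma(number of links on the page), and
    leaf pages (no links) add nothing to the path cost.
    """
    total = 0.0
    # Each stack entry is (page, cost accumulated on the path above it).
    stack = [(root, 0.0)]
    while stack:
        page, cost_above = stack.pop()
        links = children.get(page, [])
        if not links:
            # Leaf: weight the accumulated path cost by the leaf's probability.
            total += prob.get(page, 0.0) * cost_above
            continue
        cost_here = cost_above + gamma(len(links))
        for child in links:
            stack.append((child, cost_here))
    return total


# Toy tree (hypothetical): root links to pages "a" and "b"; "a" links to "c" and "d".
toy_tree = {"root": ["a", "b"], "a": ["c", "d"]}
toy_prob = {"b": 0.5, "c": 0.25, "d": 0.25}


def linear_gamma(num_links: int) -> float:
    """Linear degree cost: the page cost equals the number of links."""
    return float(num_links)


toy_cost = expected_path_cost(toy_tree, toy_prob, linear_gamma, "root")
```

In the toy tree, leaf "b" is reached after one page with two links (path cost 2), while "c" and "d" are reached after two such pages (path cost 4), so the expected path cost is 0.5·2 + 0.25·4 + 0.25·4 = 3. CSS asks for the tree, among those permitted by the transitive closure of G, that minimizes this quantity.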