Paper information - Self-adaptive user profiles for large-scale data delivery


Push-based data delivery requires knowledge of user interests for making scheduling, bandwidth allocation, and routing decisions. Such information is maintained as user profiles. We propose a novel incremental algorithm for constructing user profiles based on monitoring and user feedback. In contrast to earlier approaches, which typically represent profiles as a single weighted interest vector, we represent user profiles as multiple interest vectors, whose number, size, and elements change adaptively based on user access behavior. This flexible approach allows the profile to more accurately represent complex user interests. Although there has been significant research on user profiles, our approach is unique in that it can be tuned to trade off profile complexity against quality. This feature, together with its incremental nature, makes our method suitable for use in large-scale information filtering applications such as push-based WWW page dissemination. We evaluate the method by experimentally investigating its ability to categorize WWW pages taken from Yahoo! categories. Our results show that the method can provide high filtering effectiveness with modest profile sizes and can effectively adapt to changes in users' interests.
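The idea of a profile made of several interest vectors that grow and change with feedback can be illustrated with a minimal sketch. This is not the authors' algorithm: the class name `MultiVectorProfile`, the `threshold` and `rate` parameters, the cosine-similarity test, and the linear blending rule are all assumptions chosen for illustration. Documents are assumed to be sparse term-weight dictionaries.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MultiVectorProfile:
    """Hypothetical profile that keeps several interest vectors."""

    def __init__(self, threshold=0.3, rate=0.2):
        self.threshold = threshold  # assumed: minimum similarity needed to merge
        self.rate = rate            # assumed: how strongly new feedback shifts a vector
        self.vectors = []           # one term-weight dict per interest

    def score(self, doc):
        """Relevance of a document: similarity to the closest interest vector."""
        return max((cosine(v, doc) for v in self.vectors), default=0.0)

    def add_relevant(self, doc):
        """Incrementally fold one document the user marked as relevant into the profile."""
        best = max(self.vectors, key=lambda v: cosine(v, doc), default=None)
        if best is None or cosine(best, doc) < self.threshold:
            # No existing interest is close enough: start a new vector.
            self.vectors.append(dict(doc))
            return
        # Otherwise shift the closest vector toward the document.
        for term in set(best) | set(doc):
            best[term] = (1 - self.rate) * best.get(term, 0.0) + self.rate * doc.get(term, 0.0)


profile = MultiVectorProfile()
profile.add_relevant({"jazz": 1.0, "music": 1.0})
profile.add_relevant({"jazz": 1.0, "piano": 1.0})     # similar enough: merged
profile.add_relevant({"soccer": 1.0, "league": 1.0})  # unrelated: new vector
```

In this sketch a higher `threshold` produces more, narrower vectors and a lower one produces fewer, broader vectors, which is one simple way to picture the trade-off between profile complexity and quality that the abstract mentions.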

C. Lee Giles | Ugur Çetintemel | Michael J. Franklin
