Paper Information - Predicting Download Directories for Web Resources

Browsing the web is one of the most common activities users engage in today, and downloading web resources of interest, such as images, documents, and music, is part of this process. However, users would rather temporarily save a resource to an easily accessible default path (e.g. their "Desktop") than select the directory where they will eventually place it. This suggests that existing user interfaces are not as effective for this task as users would like. Instead of proposing a different user interface, in this paper we address the problem at its core and propose a methodology for suggesting the most likely directory in which the user would eventually save the file. Future interfaces can therefore also benefit from our technique. We provide a formal definition of the problem and propose a classification framework to tackle it. We present our overall solution, named Directory Download PrediCtor, or DiDoCtor for short. We give experimental evidence of its effectiveness by implementing our approach as part of a widely used browser and evaluating it with real user activity. We also discuss lessons learned from this process with regard to efficiency.
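
To make the idea of treating directory suggestion as a classification problem more concrete, here is a minimal Python sketch. It is not DiDoCtor's actual method, which the abstract does not describe. The features (URL host and file extension), the voting scheme, the class name DirectoryVoteClassifier, and the example URLs and directory paths are all hypothetical choices made only for illustration.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse
import posixpath


def extract_features(url):
    """Return simple categorical features for a download URL (assumed features)."""
    parsed = urlparse(url)
    extension = posixpath.splitext(parsed.path)[1].lower()
    return {"host": parsed.netloc.lower(), "ext": extension}


class DirectoryVoteClassifier:
    """Toy classifier: each feature value votes for directories it co-occurred with."""

    def __init__(self):
        # Maps (feature name, feature value) -> Counter of target directories.
        self.votes = defaultdict(Counter)
        # Global directory frequencies, used when no feature has been seen before.
        self.overall = Counter()

    def observe(self, url, directory):
        """Record that the user saved the resource at `url` into `directory`."""
        for item in extract_features(url).items():
            self.votes[item][directory] += 1
        self.overall[directory] += 1

    def suggest(self, url, k=3):
        """Return up to k candidate directories, highest score first."""
        scores = Counter()
        for item in extract_features(url).items():
            scores.update(self.votes.get(item, Counter()))
        if not scores:
            scores = self.overall  # fall back to global frequency
        return [directory for directory, _ in scores.most_common(k)]


# Hypothetical download history (URLs and directories are made up).
model = DirectoryVoteClassifier()
model.observe("http://example.org/papers/a.pdf", "~/Documents/Papers")
model.observe("http://example.org/papers/b.pdf", "~/Documents/Papers")
model.observe("http://photos.example.com/cat.jpg", "~/Pictures")

print(model.suggest("http://example.org/papers/c.pdf"))
```

Returning a short ranked list rather than a single directory reflects the goal of suggesting likely locations to the user; a real system would presumably use richer features and a learned classifier.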

George Valkanas | Dimitrios Gunopulos
