Paper information - Machine learning used by Personal WebWatcher

This paper describes the design of Personal WebWatcher, a personal browsing assistant that suggests interesting hyperlinks on the requested Web documents. Machine learning is used to generate a model of the user's interests. We consider two approaches that differ in the information included in training examples: (1) include information presented to the user, that is, a part of the text from the document that contains a hyperlink, and (2) include information that was not presented to the user, that is, the content of the document pointed to by a hyperlink. We compare two classification algorithms, k-Nearest Neighbor and Naive Bayes. A bag-of-words document representation is used, and features are selected using information gain. Preliminary experiments show that there is no significant difference between the two classifiers and that using only a small number of features gives almost the same results as using all features. In all experiments the achieved classification accuracy is the same as or slightly higher than the default accuracy. Since the default accuracy is higher for approach (1) than for approach (2), the results of approach (1) show higher classification accuracy.
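As a rough illustration of two ideas the abstract relies on, the sketch below scores bag-of-words features by information gain and computes the default accuracy, taken here to mean the accuracy of always predicting the most frequent class. It is not the paper's implementation: the toy documents, the labels, and the function names `entropy`, `information_gain`, and `default_accuracy` are all assumptions made for this example, and each word is treated as a simple binary "occurs in the document" feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the document'."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (
        (len(with_word) / n) * entropy(with_word)
        + (len(without_word) / n) * entropy(without_word)
    )
    return entropy(labels) - conditional


def default_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


# Hypothetical toy data (not from the paper): each document is a set of words,
# labelled 1 if the user followed the hyperlink and 0 otherwise.
docs = [
    {"machine", "learning", "agent"},
    {"learning", "web"},
    {"sports", "news"},
    {"weather", "news"},
]
labels = [1, 1, 0, 0]

# Rank the vocabulary by information gain and keep the two best features.
vocabulary = sorted(set().union(*docs))
ranked = sorted(
    vocabulary,
    key=lambda w: information_gain(docs, labels, w),
    reverse=True,
)
top_features = ranked[:2]
```

A classifier such as k-Nearest Neighbor or Naive Bayes would then be trained on documents restricted to `top_features`, and its accuracy compared against `default_accuracy(labels)`. When one class dominates the training examples, that baseline is already high, which is why an accuracy only slightly above it says little about the learned model.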

Dunja Mladenic | Jožef Stefan

[1] Bruce Krulwich,et al. The ContactFinder Agent: Answering Bulletin Board Questions with Referrals , 1996, AAAI/IAAI, Vol. 1.

[2] Thorsten Joachims,et al. A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization , 1997, ICML.

[3] Yoav Shoham,et al. Fab: content-based, collaborative recommendation , 1997, CACM.

[4] Dan Ionescu,et al. A Learning Agent that Assists the Browsing of Software Libraries , 2000, IEEE Trans. Software Eng.

[5] Michael J. Pazzani,et al. Syskill & Webert: Identifying interesting web sites , 1996 .

[6] Yiming Yang,et al. An Evaluation of Statistical Approaches to Text Categorization , 1999, Information Retrieval.

[7] Kristian J. Hammond,et al. A Case-Based Approach to Knowledge Navigation , 1994, IJCAI.

[8] Thorsten Joachims,et al. WebWatcher: A Learning Apprentice for the World Wide Web , 1995 .

[9] Dunja Mladenic,et al. Turning Yahoo! into an automatic Web page classifier , 1998 .

[10] Thorsten Joachims,et al. WebWatcher: Machine Learning and Hypertext , 1995 .

[11] J. R. Quinlan. Constructing Decision Trees , 1993 .

[12] Robert C. Holte,et al. A Learning Apprentice For Browsing , 1994 .

[13] Pattie Maes,et al. Agents that reduce work and information overload , 1994, CACM.

[14] Tom M. Mitchell,et al. Experience with a learning personal assistant , 1994, CACM.

[15] Robert E. Kraut,et al. The HomeNet field trial of residential Internet services , 1996, CACM.