Mining user access behavior on the WWW

This paper proposes an affinity-based approach that derives similarity measures for Web document clustering in order to discover user access behavior on the World Wide Web (WWW). The approach computes the similarity between groups of Web documents from user access patterns. Because the quality of a clustering depends on the similarity measure it is given, a better measure should lead any clustering algorithm to clusters that reflect user access behavior more faithfully. In electronic commerce, for example, companies can use the discovered access behavior to target potential customers more precisely and persuade them to purchase products or services. An experiment on a real data set shows that, under the partitioning around medoids (PAM) method, the proposed approach outperforms both the cosine coefficient and the Euclidean distance.
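To make the two baseline measures concrete, the following is a minimal Python sketch of the cosine coefficient and the Euclidean distance applied to document access vectors, followed by a tiny medoid-based clustering. It does not implement the affinity-based measure, which the abstract does not define. The access counts, the file names, and the helper names (`cosine_distance`, `distance_matrix`, `k_medoids_exhaustive`) are assumptions made for illustration only. The clustering step is an exhaustive search over the k-medoids objective that PAM optimizes heuristically, so it is only suitable for very small inputs.

```python
import math
from itertools import combinations


def cosine_coefficient(u, v):
    """Cosine of the angle between two access vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Turn the cosine similarity into a dissimilarity for clustering."""
    return 1.0 - cosine_coefficient(u, v)


def euclidean_distance(u, v):
    """Straight-line distance between two access vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(vectors, dist):
    """Pairwise dissimilarities; the diagonal is fixed at 0.0."""
    n = len(vectors)
    return [[0.0 if i == j else dist(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def k_medoids_exhaustive(dist, k):
    """Try every set of k medoids and keep the one with the lowest total
    distance from each item to its nearest medoid (the objective PAM targets)."""
    n = len(dist)
    best_cost, best_medoids = math.inf, None
    for medoids in combinations(range(n), k):
        cost = sum(min(dist[i][m] for m in medoids) for i in range(n))
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Label each item with the index of its nearest medoid.
    labels = [min(best_medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return best_medoids, labels


# Hypothetical data: one row per Web document, one column per user session,
# each entry the number of times the document was accessed in that session.
names = ["a.html", "b.html", "c.html", "d.html"]
vectors = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]

for label, dist in (("cosine", cosine_distance), ("euclidean", euclidean_distance)):
    medoids, labels = k_medoids_exhaustive(distance_matrix(vectors, dist), k=2)
    clusters = {}
    for name, medoid in zip(names, labels):
        clusters.setdefault(names[medoid], []).append(name)
    print(label, clusters)
```

On this toy input both baselines separate the documents accessed in the first two sessions from those accessed in the last two. The sketch says nothing about how the measures compare on real access logs.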
