Multiobjective evolutionary clustering of Web user sessions: a case study in Web page recommendation

In this study, we experiment with several multiobjective evolutionary algorithms to determine a suitable approach for clustering Web user sessions, each of which is a sequence of Web pages visited by a user. Our experimental results show that the multiobjective evolutionary algorithm-based approaches are successful for sequence clustering. We verify these findings with a commonly used cluster validity index, whose values indicate that the clustering solutions are of high quality. As a case study, the obtained clusters are then used to represent usage patterns in a Web recommender system. The experiments show that these approaches can generate clustering solutions that lead to high recommendation accuracy in the recommender model used in this paper.
