Predicting web user behavior using learning-based ant colony optimization

We present an ant colony optimization-based algorithm for predicting web usage patterns. The methodology combines multiple data sources: web content, web structure, and web usage. The model follows a continuous learning strategy in which artificial ants adjust a text preference vector so that their sessions fit previously observed real usage. The trained ants are then released onto a new web graph, and the artificial sessions they generate are compared with real sessions captured through web log processing. The main result is an effective prediction of the aggregated patterns of real usage, reaching approximately 80%. In addition, the approach yields a quantitative representation of the keywords that influence navigational sessions.
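The idea of an ant guided by a text preference vector can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `cosine`, `walk`, and `update_preference`, the use of cosine similarity as the link-choice weight, the omission of pheromone trails, and the simple averaging rule used to move the preference vector toward the pages of a real session.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length keyword-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def walk(graph, content, preference, start, max_steps, rng):
    """Generate one artificial session (hypothetical transition rule).

    graph:      page -> list of linked pages (web structure)
    content:    page -> keyword-weight vector (web content)
    preference: the ant's text preference vector
    """
    session = [start]
    page = start
    for _ in range(max_steps):
        links = graph.get(page, [])
        if not links:
            break  # dead end: the session stops here
        # Assumption: a link is chosen with probability proportional to the
        # similarity between the ant's preferences and the target page text.
        # The small constant keeps every link reachable.
        weights = [cosine(preference, content[n]) + 1e-9 for n in links]
        page = rng.choices(links, weights=weights, k=1)[0]
        session.append(page)
    return session


def update_preference(preference, real_session, content, rate=0.1):
    """Move the preference vector toward the mean keyword vector of the pages
    in a real session (an assumed learning rule, not the paper's)."""
    n = len(real_session)
    mean = [sum(content[p][i] for p in real_session) / n
            for i in range(len(preference))]
    return [(1 - rate) * w + rate * m for w, m in zip(preference, mean)]


# Toy site with two keywords; all values are made up for illustration.
graph = {"home": ["news", "shop"], "news": ["home"], "shop": []}
content = {"home": [1.0, 1.0], "news": [1.0, 0.0], "shop": [0.0, 1.0]}

session = walk(graph, content, [0.0, 1.0], "home", 5, random.Random(0))
trained = update_preference([0.0, 1.0], ["home", "news"], content, rate=0.5)
```

In this toy setup an ant that only cares about the second keyword is drawn to `shop`, and one update against a real session that visited `news` shifts weight toward the first keyword. A fuller version would repeat the update until artificial and real sessions agree, then reuse the trained vector on a new graph.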
