Identifying web sessions with simulated annealing

Delivering efficient service through a web site requires that any redesign take user behavior into account. This behavior can be studied through the web log file, which records only partial information about user visits. Reconstructing the sequences of pages visited by every user who browses a web site is known as the web sessionization problem, and it has been formulated as an integer programming model. However, web logs accumulate a large amount of data, and sessions typically must be reconstructed over periods of weeks or months, so solving this model requires long computation times. This paper presents a heuristic approach to the sessionization problem based on simulated annealing. With this approach, processing time is reduced by a factor of up to 166 compared with the time required by the integer programming model. Furthermore, the metaheuristic finds new best objective values, with improvements on the order of 17% in the best cases.
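
The general idea of applying simulated annealing to sessionization can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the objective, the neighborhood, or the cooling schedule, so everything below is an assumption made for illustration. Records are assumed to be `(timestamp, page)` tuples from a single user, a session is assumed valid when consecutive pages are joined by a hyperlink and fall within `max_gap` seconds, and the cost is simply the number of sessions (a stand-in for the integer programming objective). The names `is_valid`, `cost`, `neighbour`, and `anneal` are hypothetical.

```python
import math
import random


def is_valid(session, links, max_gap):
    """Assumed rule: consecutive records must be hyperlinked and close in time."""
    return all(
        (a[1], b[1]) in links and 0 < b[0] - a[0] <= max_gap
        for a, b in zip(session, session[1:])
    )


def cost(sessions):
    # Assumed objective: fewer (hence longer) sessions are better.
    return len(sessions)


def neighbour(sessions, links, max_gap, rng):
    """Move one random record to another session or to a new one.

    Returns the original solution unchanged if the move breaks validity.
    """
    new = [list(s) for s in sessions]
    src = rng.randrange(len(new))
    rec = new[src].pop(rng.randrange(len(new[src])))
    dst = rng.randrange(len(new) + 1)
    if dst == len(new):
        new.append([])
    new[dst] = sorted(new[dst] + [rec])  # keep records in time order
    new = [s for s in new if s]          # drop sessions left empty
    if all(is_valid(s, links, max_gap) for s in new):
        return new
    return sessions


def anneal(records, links, max_gap, t0=1.0, alpha=0.995, steps=2000, seed=0):
    """Metropolis acceptance with geometric cooling (illustrative parameters)."""
    rng = random.Random(seed)
    current = [[r] for r in sorted(records)]  # start: one record per session
    best = current
    t = t0
    for _ in range(steps):
        cand = neighbour(current, links, max_gap, rng)
        delta = cost(cand) - cost(current)
        # Always accept improvements; accept worse moves with prob. exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = cand
        if cost(current) < cost(best):
            best = current
        t = max(t * alpha, 1e-9)  # cool down, avoiding division by zero
    return best


if __name__ == "__main__":
    records = [(0, "A"), (5, "A"), (10, "B"), (15, "B"), (20, "C")]
    links = {("A", "B"), ("B", "C")}
    for session in anneal(records, links, max_gap=30):
        print(session)
```

The sketch trades optimality guarantees for speed: each step evaluates a single local move instead of solving the full model, which is the kind of trade-off the reported reduction in processing time reflects. The starting temperature, cooling rate, and step count would need tuning on real log data.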
