Influence of Different Session Timeouts Thresholds on Results of Sequence Rule Analysis in Educational Data Mining

Web usage mining methods are applied to learning management systems to reveal the knowledge hidden in the log files of their web and database servers. Applying data mining methods to these data identifies interesting patterns in users' behaviour. These patterns help us find the most effective structure for e-learning courses, optimize the learning content, recommend the most suitable learning path based on a student's behaviour, or provide a more personalized environment. We prepare six datasets of different quality, obtained from the logs of a learning management system and pre-processed in different ways. Three datasets contain user sessions identified using 15-, 30- and 60-minute session timeout thresholds; the other three use the same thresholds and additionally include reconstructed paths among course activities. We assess the impact of the different session timeout thresholds, with and without path completion, on the quantity and quality of the extracted sequence rules that represent learners' behavioural patterns in the learning management system. The results show that the session timeout threshold has a significant impact on the quality and quantity of the extracted sequence rules. In contrast, path completion has no significant impact on either the quantity or the quality of the extracted rules.
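As a rough illustration of timeout-based session identification, the sketch below splits one user's log records into a new session whenever the gap between consecutive requests exceeds the threshold. It is not the authors' pre-processing pipeline: the record layout `(user_id, timestamp, activity)`, the function name `split_sessions`, and the sample log are all assumptions made for the example, and path completion is not covered.

```python
from datetime import datetime, timedelta
from itertools import groupby


def split_sessions(records, timeout_minutes):
    """Group (user_id, timestamp, activity) records into sessions.

    A new session starts when the gap between two consecutive requests
    of the same user exceeds timeout_minutes. The record layout is a
    hypothetical one chosen for this sketch.
    """
    timeout = timedelta(minutes=timeout_minutes)
    sessions = []
    # Sort by user, then by time, so each user's requests are contiguous.
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    for user, hits in groupby(ordered, key=lambda r: r[0]):
        current, previous = [], None
        for _, timestamp, activity in hits:
            if previous is not None and timestamp - previous > timeout:
                sessions.append((user, current))
                current = []
            current.append(activity)
            previous = timestamp
        if current:
            sessions.append((user, current))
    return sessions


# Invented sample log: user u1 has gaps of 10, 20 and 45 minutes.
log = [
    ("u1", datetime(2024, 1, 1, 9, 0), "course_view"),
    ("u1", datetime(2024, 1, 1, 9, 10), "quiz_view"),
    ("u1", datetime(2024, 1, 1, 9, 30), "forum_view"),
    ("u1", datetime(2024, 1, 1, 10, 15), "resource_view"),
    ("u2", datetime(2024, 1, 1, 9, 5), "course_view"),
]

# Count the sessions produced by each of the three thresholds.
session_counts = {t: len(split_sessions(log, t)) for t in (15, 30, 60)}
print(session_counts)
```

On this toy log, shorter thresholds cut the same click stream into more and shorter sessions, which in turn changes the sequences available to a sequence rule miner.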
