Impact of Different Pre-Processing Tasks on Effective Identification of Users' Behavioral Patterns in Web-based Educational System

Abstract: Analyzing the unique types of data that come from educational systems can help find the most effective structure for e-learning courses, optimize the learning content, recommend the most suitable learning path based on students' behavior, or provide a more personalized environment. We focus only on the processes involved in the data preparation stage of web usage mining. Our objective is to specify the steps required to obtain valid data from the stored logs of a web-based educational system. We compare three datasets of different quality obtained from logs of the web-based educational system and pre-processed in different ways: data with identified users’ sessions and data with the reconstructed path among course activities. We assess the impact of these advanced data pre-processing techniques on the quantity and quality of the extracted rules that represent the learners’ behavioral patterns in a web-based educational system. The results confirm some initial assumptions, but they also show that path reconstruction among visited activities in an e-learning course has no statistically significant effect on the quality or quantity of the extracted rules.
