Knowledge Discovery in Learning Management System Using Piecewise Linear Regression

Recent developments in database technology have led to a wide variety of data being stored in huge collections. This variety makes the analysis of a generic database a strenuous task in knowledge discovery. One approach is to summarize large datasets so that the resulting summary dataset is of manageable size. Histograms have received significant attention as summary (representative) objects for large databases, but they are costly in both computation and storage. In this paper, we propose transforming the histogram object into a Piecewise Linear Regression (PLR) line object, and we suggest that PLR objects can be less computation- and storage-intensive than histograms. To support cluster analysis, we also propose a distance measure for computing the distance between PLR lines. A case study based on real data from an online education system, a Learning Management System (LMS), is presented. It demonstrates that PLR is a powerful knowledge representative for very large databases.
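
As a rough illustration of the idea, the sketch below turns a histogram into a small set of line parameters and compares two such summaries. It is not the paper's method: fitting the lines to the cumulative histogram, splitting the bins into equal contiguous groups, and using a Euclidean distance on the (slope, intercept) pairs are all assumptions made here for illustration, and the names `histogram_to_plr` and `plr_distance` are hypothetical.

```python
import numpy as np


def histogram_to_plr(counts, edges, n_segments=2):
    """Summarize a histogram as one least-squares line per segment.

    Assumption (not from the paper): lines are fitted to the cumulative
    relative frequency, and segments are equal contiguous groups of bins.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have one more entry than counts")
    if len(counts) < 2 * n_segments:
        raise ValueError("need at least two bins per segment")

    # Cumulative relative frequency, evaluated at each right bin edge.
    cum = np.cumsum(counts) / counts.sum()
    x = edges[1:]

    params = []
    for idx in np.array_split(np.arange(len(x)), n_segments):
        # Ordinary least-squares line through the points of this segment.
        slope, intercept = np.polyfit(x[idx], cum[idx], deg=1)
        params.append((slope, intercept))
    return np.array(params)  # shape: (n_segments, 2)


def plr_distance(p, q):
    """Illustrative distance: Euclidean norm of the parameter differences."""
    return float(np.linalg.norm(np.asarray(p) - np.asarray(q)))


# Two small hand-made histograms over the same bin edges.
edges = [0, 1, 2, 3, 4]
plr_a = histogram_to_plr([1, 1, 1, 1], edges)
plr_b = histogram_to_plr([4, 2, 1, 1], edges)
print(plr_a)
print(plr_distance(plr_a, plr_b))
```

Under these assumptions, each summary stores `2 * n_segments` numbers regardless of the number of bins, which is where a storage saving over the raw histogram would come from when the bin count is large.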
