Visualization of Data Mining Rules using OLAP

Data mining is an emerging knowledge discovery process that extracts previously unknown, actionable information from very large scientific and commercial databases. Usually, a data mining process extracts rules by processing high-dimensional categorical and/or numerical data. However, the user often has to analyze hundreds of extracted rules in order to grasp valuable knowledge. As a result, the analysis of such rules by means of visual tools has evolved rapidly in recent years. Visual data mining attempts to take advantage of humans' ability to perceive pattern and structure in visual form. Researchers have developed many tools to visualize data mining rules, but few of these tools can effectively handle more than a few dozen rules. In this paper, we propose a new technique for visualizing data mining rules based on OLAP. More specifically, the proposed technique uses the standard two-dimensional cross-tabulation table of most OLAP models to visualize even a large number of data mining rules. The advantage of the proposed technique is that the user can choose to "drill down" on specific subsets of such rules. We also present experimental results that demonstrate how the proposed technique helps in analyzing and understanding extracted data mining rules.
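The idea of summarizing rules in a two-dimensional cross-tabulation and then drilling down into one cell can be illustrated with a minimal sketch. It is not the paper's implementation: the choice of axes (antecedent attribute by consequent), the `attribute=value` item encoding, the toy weather rules, and the names `RULES`, `cross_tab`, `drill_down` and `render` are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy rules: (antecedent items, consequent, confidence).
# Items use an assumed "attribute=value" encoding.
RULES = [
    ({"outlook=sunny", "humidity=high"}, "play=no", 0.90),
    ({"outlook=overcast"}, "play=yes", 1.00),
    ({"outlook=rainy", "windy=false"}, "play=yes", 0.85),
    ({"humidity=normal"}, "play=yes", 0.80),
    ({"outlook=rainy", "windy=true"}, "play=no", 0.75),
]


def attribute(item):
    """Return the attribute name of an 'attribute=value' item."""
    return item.split("=", 1)[0]


def cross_tab(rules):
    """Count rules per (antecedent attribute, consequent) cell."""
    table = defaultdict(int)
    for antecedent, consequent, _confidence in rules:
        # A rule contributes once to each attribute it mentions.
        for attr in {attribute(item) for item in antecedent}:
            table[(attr, consequent)] += 1
    return dict(table)


def drill_down(rules, attr, consequent):
    """Return the individual rules summarized by one table cell."""
    return [
        rule
        for rule in rules
        if rule[1] == consequent
        and any(attribute(item) == attr for item in rule[0])
    ]


def render(table):
    """Format the cross-tabulation as plain text, one row per attribute."""
    rows = sorted({attr for attr, _ in table})
    cols = sorted({cons for _, cons in table})
    lines = ["{:<10}".format("") + "".join("{:>10}".format(c) for c in cols)]
    for attr in rows:
        cells = "".join("{:>10}".format(table.get((attr, c), 0)) for c in cols)
        lines.append("{:<10}".format(attr) + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(cross_tab(RULES)))
    # Drill down on the cell (outlook, play=yes) to list its rules.
    for antecedent, consequent, confidence in drill_down(RULES, "outlook", "play=yes"):
        print(sorted(antecedent), "->", consequent, confidence)
```

Each cell holds only a count, so the table stays the same size however many rules fall into it; the individual rules are listed only for the cell the user selects.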
