Visualization of Rules in Rule-Based Classifiers

Interpretation and visualization of classification models are important parts of machine learning. Rule-based classifiers often contain too many rules for humans to interpret easily, so methods for post-classification analysis of the rules are needed. Here we present a strategy for circular visualization of sets of classification rules. We used the Circos software to generate graphs in which every pair of conditions that co-occurs in a rule is drawn as an edge inside a circle. Using simulated data, we showed that all two-way interactions in the data were found by the classifier and displayed in the graph, even though the individual attributes were constructed to have no correlation with the decision class. For all examples we used rules trained using rough set theory, but the visualization would be applicable to any kind of classification rules. This method of rule visualization may be useful in applications where interaction terms are expected and the size of the model limits its interpretability.
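
The pairing step described above can be illustrated with a minimal sketch. It assumes a hypothetical rule representation (a dictionary of attribute conditions plus a decision class) and made-up example rules; it is not the authors' implementation and does not produce Circos input files. It only counts how often each pair of conditions appears together in a rule, which is the quantity an edge in such a graph would represent.

```python
from collections import Counter
from itertools import combinations

# Hypothetical rules for illustration only: (conditions, decision class).
rules = [
    ({"A": 1, "B": 0}, "yes"),
    ({"A": 1, "B": 0, "C": 1}, "yes"),
    ({"B": 1, "C": 0}, "no"),
    ({"A": 0}, "no"),  # a single-condition rule contributes no pairs
]


def condition_pair_counts(rules):
    """Count, over all rules, how often two conditions occur in the same rule."""
    counts = Counter()
    for conditions, _decision in rules:
        # Sort the condition labels so each pair has one canonical order.
        items = sorted(f"{attr}={value}" for attr, value in conditions.items())
        for pair in combinations(items, 2):
            counts[pair] += 1
    return counts


edges = condition_pair_counts(rules)
for (left, right), n in edges.most_common():
    print(left, right, n)
```

Each key of `edges` corresponds to one candidate edge between two conditions, and the count could be mapped to edge width or color. How the counts are weighted (for example by rule support or accuracy) is a design choice that this sketch leaves open.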
