A New Multiobjective Evolutionary Algorithm for Mining a Reduced Set of Interesting Positive and Negative Quantitative Association Rules

Most algorithms for mining quantitative association rules focus on positive dependencies and pay little attention to negative ones. Negative dependencies may nevertheless be worth considering, as they relate the presence of certain items to the absence of others. In addition, the algorithms used to extract such rules usually rely on a single evaluation criterion to measure the quality of the generated rules. Recently, some researchers have framed the extraction of association rules as a multiobjective problem, which allows several measures to be optimized jointly; these measures can present different degrees of trade-off depending on the dataset used. In this paper, we propose MOPNAR, a new multiobjective evolutionary algorithm that mines a reduced set of positive and negative quantitative association rules with low computational cost. To accomplish this, our proposal extends a recent multiobjective evolutionary algorithm based on decomposition to perform an evolutionary learning of the intervals of the attributes and a condition selection for each rule. It also introduces an external population, which stores all the nondominated rules found, and a restarting process, which improves the diversity of the rule set obtained. Moreover, the proposal maximizes three objectives (comprehensibility, interestingness, and performance) in order to obtain rules that are interesting, easy to understand, and provide good coverage of the dataset. The effectiveness of the proposed approach is validated over several real-world datasets.
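To make the notion of a positive or negative quantitative rule concrete, the following minimal sketch evaluates a rule whose conditions are attribute intervals, each optionally negated. It is an illustration only, not the MOPNAR implementation: the names `Condition`, `support`, and `confidence`, the toy records, and the choice of support and confidence as measures are all assumptions made here, and they are not the three objectives optimized in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Condition:
    """Hypothetical interval condition on one numeric attribute."""
    attribute: str
    low: float
    high: float
    negated: bool = False  # True means "attribute NOT in [low, high]"

    def matches(self, record: Dict[str, float]) -> bool:
        inside = self.low <= record[self.attribute] <= self.high
        # A negated condition holds exactly when the value is outside.
        return inside != self.negated


def support(conditions: List[Condition], records: List[Dict[str, float]]) -> float:
    """Fraction of records that satisfy every condition."""
    if not records:
        return 0.0
    hits = sum(all(c.matches(r) for c in conditions) for r in records)
    return hits / len(records)


def confidence(antecedent: List[Condition], consequent: List[Condition],
               records: List[Dict[str, float]]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    ant = support(antecedent, records)
    if ant == 0:
        return 0.0
    return support(antecedent + consequent, records) / ant


# Toy data (invented for illustration).
records = [
    {"age": 25, "income": 20},
    {"age": 30, "income": 25},
    {"age": 45, "income": 60},
    {"age": 50, "income": 70},
]

# Rule: age in [20, 35] -> income NOT in [50, 100]
antecedent = [Condition("age", 20, 35)]
consequent = [Condition("income", 50, 100, negated=True)]

print(support(antecedent + consequent, records))
print(confidence(antecedent, consequent, records))
```

Here the consequent is a negative item: the rule states that records with age in the given interval tend to fall outside the given income interval. An evolutionary approach such as the one described would adjust the interval bounds and choose which conditions take part in each rule, rather than fixing them by hand as in this sketch.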
