Selecting the best measures to discover quantitative association rules

Most existing techniques for mining association rules use support and confidence to evaluate the quality of the rules obtained. However, these two measures have inherent drawbacks and may not be sufficient to assess rule quality properly. A review of the literature reveals that many other quality measures exist, but that optimizing all of them simultaneously is complex and might lead to poor results. In this work, a principal components analysis is applied to a set of measures that evaluate the quality of quantitative association rules. From this analysis, a reduced subset of measures is selected for inclusion in the fitness function, with the aim of obtaining better values for the whole set of quality measures and not only for those included in the fitness function. The methodology is general-purpose and can therefore be applied to the fitness function of any algorithm. To validate whether the fitness function composed of the proposed subset of measures yields better results, the existing QARGA algorithm has been applied to a wide variety of datasets. Finally, these results are compared with those obtained by QARGA with its original fitness function, showing a remarkable improvement when the new fitness function is used.
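The idea of applying a principal components analysis to a table of rule-quality measures can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four measures (support, confidence, lift, leverage), the synthetic rules, the 90% variance threshold, and the selection heuristic of keeping the measure with the largest absolute loading on each retained component. It is not the set of measures, the data, or the selection procedure used in the paper, and the function names are hypothetical.

```python
import numpy as np

MEASURE_NAMES = ["support", "confidence", "lift", "leverage"]  # illustrative subset


def rule_measures(p_a, p_c, p_ac):
    """Quality measures for a rule A => C.

    p_a, p_c: fraction of records covered by the antecedent / consequent.
    p_ac: fraction of records covered by both.
    """
    support = p_ac
    confidence = p_ac / p_a
    lift = confidence / p_c
    leverage = p_ac - p_a * p_c
    return [support, confidence, lift, leverage]


def pca_loadings(X):
    """PCA on standardized columns via SVD.

    Returns the explained-variance ratio of each component and the
    loadings (one row per component, one column per measure).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return explained, vt


def select_measures(X, names, var_threshold=0.9):
    """Assumed heuristic: for each component needed to reach the variance
    threshold, keep the not-yet-chosen measure with the largest |loading|."""
    explained, vt = pca_loadings(X)
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(names))
    chosen = []
    for component in vt[:k]:
        for idx in np.argsort(-np.abs(component)):
            if names[idx] not in chosen:
                chosen.append(names[idx])
                break
    return chosen


# Synthetic rules (made-up coverage values, not real data).
rng = np.random.default_rng(0)
n_rules = 200
p_a = rng.uniform(0.1, 0.9, n_rules)
p_c = rng.uniform(0.1, 0.9, n_rules)
lo = np.maximum(0.0, p_a + p_c - 1.0)   # smallest feasible joint coverage
hi = np.minimum(p_a, p_c)               # largest feasible joint coverage
p_ac = lo + rng.uniform(0.0, 1.0, n_rules) * (hi - lo)

X = np.array([rule_measures(a, c, ac) for a, c, ac in zip(p_a, p_c, p_ac)])
chosen = select_measures(X, MEASURE_NAMES)
print("Measures kept for the fitness function:", chosen)
```

In a setting like the one described, the selected measures would then be combined in the fitness function of the evolutionary algorithm, while the full set of measures is still reported to check whether the unselected ones improve as well.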
