Selecting Relevant Association Rules From Imperfect Data

Association Rule Mining (ARM) on imperfect data (e.g., imprecise data) has received little attention so far, despite the prevalence of such data in a wide range of real-world applications. In this work, we present an ARM approach that handles imprecise data and derives imprecise rules. Based on evidence theory and Multiple Criteria Decision Analysis, the proposed approach relies on a selection procedure that identifies the most relevant rules by taking into account information characterizing their interestingness. We present the interestingness measures defined for comparing the rules, as well as the selection procedure itself. We also show how a priori knowledge about attribute values, defined in domain taxonomies, can be used to (i) ease the mining process and (ii) help identify relevant rules for a domain of interest. Our approach is illustrated on a simplified, concrete case study related to the analysis of humanitarian projects.
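
To make the role of evidence theory concrete, the following minimal sketch shows how imprecise observations can be turned into a mass function, and how belief and plausibility then bound the support of a set of values. It is an illustration under stated assumptions, not the procedure proposed in this work: the function names, the uniform allocation of mass across observations, and the toy data are all hypothetical.

```python
from fractions import Fraction


def mass_function(observations):
    """Assign equal mass to each imprecise observation (a set of possible values).

    Assumption for this sketch: every observation is equally reliable.
    """
    n = len(observations)
    masses = {}
    for obs in observations:
        focal = frozenset(obs)
        masses[focal] = masses.get(focal, Fraction(0)) + Fraction(1, n)
    return masses


def belief(masses, target):
    """Sum the masses of focal sets entirely contained in `target`."""
    target = frozenset(target)
    return sum((m for focal, m in masses.items() if focal <= target), Fraction(0))


def plausibility(masses, target):
    """Sum the masses of focal sets that intersect `target`."""
    target = frozenset(target)
    return sum((m for focal, m in masses.items() if focal & target), Fraction(0))


# Toy data (hypothetical): the sector of four projects, some known only imprecisely.
observations = [
    {"health"},
    {"health", "education"},
    {"education"},
    {"health", "water"},
]
masses = mass_function(observations)

# Belief and plausibility act as lower and upper bounds on the support of "health".
lower = belief(masses, {"health"})
upper = plausibility(masses, {"health"})
print(lower, upper)
```

In this toy setting only one observation is known precisely to be "health", while three observations are compatible with it, so the interval between belief and plausibility reflects the imprecision of the data.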
