Paper information - A New and Useful Syntactic Restriction on Rule Semantics for Tabular Datasets

Rule semantics have been defined in many contexts, for example functional dependencies in databases and association rules in data mining. In this paper, we focus on the class of rule semantics for tabular data for which Armstrong's axiom system is sound and complete, the so-called well-formed semantics. The main contribution of this paper is to show that an equivalence exists between some syntactic restrictions on the natural definition of a given semantics and the fact that this semantics is well-formed. From a practical point of view, this equivalence makes it easy to prove whether or not a new semantics is well-formed. Moreover, the same reasoning on rules can be performed over any well-formed semantics. We also point out the relationship between our generic definition of rule satisfaction and the underlying data mining problem: given a well-formed semantics and a relation, discover a cover of the rules satisfied in this relation. This work is rooted in a bioinformatics application, the discovery of gene regulatory networks from gene expression data.
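
As an illustration of what "the same reasoning on rules" can mean in practice, the sketch below uses the standard attribute-closure procedure, which decides rule implication whenever Armstrong's axioms are sound and complete. It is not taken from the paper: the names `closure`, `implies`, and `RULES`, and the toy attributes A to E, are assumptions made for this example only.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A rule X -> Y is modelled as a pair (X, Y) of attribute sets.
# This representation is an assumption of this sketch, not the paper's notation.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(attrs: Iterable[str], rules: List[Rule]) -> FrozenSet[str]:
    """Return every attribute derivable from `attrs` using `rules`."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # Fire a rule when its left-hand side is already derived
            # and it still contributes at least one new attribute.
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return frozenset(closed)


def implies(rules: List[Rule], lhs: Iterable[str], rhs: Iterable[str]) -> bool:
    """X -> Y follows from `rules` exactly when Y is inside the closure of X."""
    return frozenset(rhs) <= closure(lhs, rules)


# Hypothetical rule set over toy attributes A..E.
RULES: List[Rule] = [
    (frozenset({"A"}), frozenset({"B"})),
    (frozenset({"B"}), frozenset({"C"})),
    (frozenset({"C", "D"}), frozenset({"E"})),
]

if __name__ == "__main__":
    print(sorted(closure({"A"}, RULES)))
    print(implies(RULES, {"A", "D"}, {"E"}))
    print(implies(RULES, {"A"}, {"E"}))
```

Because the procedure only manipulates the rules themselves, it does not depend on which well-formed semantics produced them; only the test of whether a rule is satisfied in a relation would change from one semantics to another.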

Jean-Marc Petit | Marie Pailloux
