A New and Useful Syntactic Restriction on Rule Semantics for Tabular Datasets

Rule semantics have been defined in many contexts, for example functional dependencies in databases and association rules in data mining. In this paper, we focus on the class of rule semantics for tabular data for which Armstrong's axiom system is sound and complete, the so-called \emph{well-formed semantics}. The main contribution of this paper is to show that an \emph{equivalence} exists between certain syntactic restrictions on the natural definition of a given semantics and the fact that this semantics is well-formed. From a practical point of view, this equivalence makes it easy to prove whether or not a new semantics is well-formed. Moreover, the same reasoning on rules can be performed over any well-formed semantics. We also point out the relationship between our generic definition of rule satisfaction and the underlying data mining problem: given a well-formed semantics and a relation, discover a cover of the rules satisfied in this relation. This work has its roots in a bioinformatics application, the discovery of gene regulatory networks from gene expression data.
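To illustrate what "the same reasoning on rules" can mean in practice, the sketch below shows the standard attribute-closure procedure that decides whether a rule follows from a set of rules under Armstrong's axiom system. It is a minimal illustration and not the paper's own algorithm: the names `closure` and `implies`, the representation of a rule as a pair of attribute sets, and the example rules over attributes A, B and C are all assumptions made here.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Hypothetical representation: a rule X -> Y is a pair (X, Y) of attribute sets.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(attrs: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Return every attribute derivable from `attrs` using `rules`."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # Fire a rule when its left-hand side is already derived
            # and it still contributes at least one new attribute.
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return closed


def implies(rules: List[Rule], lhs: Iterable[str], rhs: Iterable[str]) -> bool:
    """Decide whether lhs -> rhs follows from `rules` by Armstrong's axioms."""
    return set(rhs) <= closure(lhs, rules)


# Illustrative rules (assumed, not taken from the paper): A -> B and B -> C.
example_rules: List[Rule] = [
    (frozenset({"A"}), frozenset({"B"})),
    (frozenset({"B"}), frozenset({"C"})),
]

if __name__ == "__main__":
    print(implies(example_rules, {"A"}, {"C"}))  # transitivity: A -> C
    print(implies(example_rules, {"C"}, {"A"}))  # not derivable
```

Because the procedure depends only on the axiom system and not on how rule satisfaction is defined over the data, a check of this kind would apply unchanged to any semantics for which Armstrong's axioms are sound and complete.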
