Extended Functional Dependencies as a Basis for Linguistic Summaries

This paper is concerned with knowledge discovery in databases and linguistic summaries of data. The summaries proposed here describe data qualitatively (rather than quantitatively, as a probabilistic approach does), and they use linguistic terms to obtain wider coverage than Boolean summaries. They are based on extended functional dependencies and are situated in the framework of the relational model of data. Such summaries express meta-knowledge about the database content according to the pattern “for any tuple t in relation R: the more A, the more B” (for instance: the taller the player, the higher his score in the NBA championship), where A and B are two linguistic terms. In addition, an algorithm implementing the discovery process, which takes advantage of properties of extended functional dependencies, is given. This algorithm is iterative: each tuple is considered in turn to refine the set of valid summaries.
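To make the pattern "the more A, the more B" and the tuple-by-tuple refinement concrete, the following is a minimal sketch and not the paper's algorithm. It assumes one common reading of such gradual summaries: a pair (A, B) stays valid as long as, for every tuple t, the degree to which t satisfies A does not exceed the degree to which it satisfies B. The membership functions, the term names, the thresholds, and the helper names `ramp` and `discover` are all hypothetical and chosen only for illustration.

```python
from itertools import product


def ramp(lo, hi):
    """Build an increasing membership function: 0 at or below lo, 1 at or above hi."""
    def mu(x):
        if x <= lo:
            return 0.0
        if x >= hi:
            return 1.0
        return (x - lo) / (hi - lo)
    return mu


# Hypothetical linguistic terms per attribute (thresholds are illustrative only).
TERMS = {
    "height": {"tall": ramp(190, 210), "very tall": ramp(200, 215)},
    "score": {"high": ramp(10, 25), "very high": ramp(20, 30)},
}


def discover(tuples, x_attr, y_attr, terms):
    """Return the (A, B) term pairs for which mu_A(t.x) <= mu_B(t.y) on every tuple.

    Assumed validity criterion; the candidate set starts with all pairs and is
    refined by examining one tuple at a time.
    """
    candidates = set(product(terms[x_attr], terms[y_attr]))
    for t in tuples:
        candidates = {
            (a, b)
            for (a, b) in candidates
            if terms[x_attr][a](t[x_attr]) <= terms[y_attr][b](t[y_attr])
        }
        if not candidates:  # nothing left to refine
            break
    return candidates


# Made-up tuples, used only to exercise the sketch.
players = [
    {"height": 200, "score": 20},
    {"height": 210, "score": 28},
    {"height": 195, "score": 12},
]
valid = discover(players, "height", "score", TERMS)
print(sorted(valid))
```

Under this assumed criterion a single counterexample tuple is enough to discard a candidate summary, so the candidate set can only shrink as tuples are read, which is what makes a one-pass iterative refinement possible.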