Using data mining techniques to analyze correspondences between user and scientific knowledge in an agricultural environment

The incorporation of technology is recognized as one of the main factors affecting the future evolution of agriculture. In this context, considerable effort is being devoted to developing systems that manage knowledge about the different aspects of cultivation for decision-making purposes. An important problem is that, on many occasions, the information and knowledge used to make decisions about a given topic come from different sources. This is the case for soils, where scientific knowledge and user knowledge coexist. Information fusion is needed to facilitate the analysis, comparison and exploitation of knowledge coming from different sources. One particular case is that of having two different classifications (partitions) of the same set of objects. A first step toward integrating them is to study their possible correspondence. In this paper we introduce several kinds of possible correspondences between partitions, and we propose the use of data mining techniques to measure their accuracy. For that purpose, partitions are represented as relational tables, and correspondences are identified with association rules and approximate dependencies. The accuracy of a correspondence is then measured by means of the accuracy measures of the associated rule or dependency, and some results relating accuracy values to correspondence cases are shown. Finally, we provide some examples of the application of our proposal to a real-world problem, the integration of user and scientific classifications of soils, which is of primary interest for decision making in agricultural environments.
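
As an illustration of the idea of representing two partitions as a relational table and reading a correspondence between classes as an association rule, the following minimal Python sketch computes the standard support and confidence of a rule of the form "user class = a implies scientific class = b". The data, the class labels and the function name `rule_measures` are hypothetical and chosen only for illustration; the accuracy measures actually used in the paper may differ.

```python
from collections import Counter


def rule_measures(rows, antecedent, consequent):
    """Return (support, confidence) of the rule
    user class = antecedent  =>  scientific class = consequent.

    rows is a list of (user_class, scientific_class) tuples, one per object,
    i.e. a two-column relational table describing both partitions.
    """
    n = len(rows)
    pair_counts = Counter(rows)                  # how often each class pair co-occurs
    user_counts = Counter(u for u, _ in rows)    # size of each user class

    both = pair_counts[(antecedent, consequent)]
    support = both / n if n else 0.0
    # Confidence is the fraction of the user class that falls in the scientific class.
    confidence = both / user_counts[antecedent] if user_counts[antecedent] else 0.0
    return support, confidence


# Hypothetical table: each row is one soil sample (user class, scientific class).
rows = [
    ("red", "A"), ("red", "A"), ("red", "A"), ("red", "B"),
    ("white", "B"), ("white", "B"),
    ("stony", "C"), ("stony", "C"),
]

print(rule_measures(rows, "red", "A"))    # partial correspondence
print(rule_measures(rows, "white", "B"))  # user class fully contained in a scientific class
```

In this toy table, a confidence of 1 for a rule means that every object in the user class belongs to the scientific class, while lower values indicate only a partial correspondence between the two classes.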
