Spatial Data Mining by Decision Trees

Existing data mining methods cannot be applied directly to spatial data, because spatial data require specific characteristics, such as spatial relationships, to be taken into account. This paper focuses on classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on both approaches: the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial data set in the accidentology domain. A comparative study of our approach with other work on classification by spatial decision trees is detailed.

Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining.
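To make the two approaches concrete, the following is a minimal Python sketch, not the implementation described in the paper. It assumes toy contents for the three input tables (target table, neighbor table, spatial join index) and a hypothetical Boolean spatial attribute, "the accident lies within max_dist meters of a neighbor of a given type". All names, values, and the distance threshold are illustrative assumptions. The split criterion is the standard C4.5 gain ratio.

```python
from collections import Counter
from math import log2

# Hypothetical toy data; none of these names or values come from the paper.
target = {1: "severe", 2: "minor", 3: "severe", 4: "minor"}  # accident id -> class
neighbor = {10: "school", 11: "bar", 12: "school"}           # neighbor id -> type
# Spatial join index: (target id, neighbor id, distance in meters)
join_index = [(1, 10, 40.0), (1, 11, 300.0), (2, 11, 50.0),
              (3, 12, 80.0), (4, 10, 900.0)]


def materialize(neighbor_type, max_dist):
    """Join materialization: precompute the attribute for every target row."""
    near = {tid: False for tid in target}
    for tid, nid, dist in join_index:
        if neighbor[nid] == neighbor_type and dist <= max_dist:
            near[tid] = True
    return near


def on_the_fly(tid, neighbor_type, max_dist):
    """Querying on the fly: scan the join index each time the value is needed."""
    return any(t == tid and neighbor[n] == neighbor_type and d <= max_dist
               for t, n, d in join_index)


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum(c / total * log2(c / total) for c in Counter(values).values())


def gain_ratio(attribute_of):
    """C4.5 split criterion: information gain divided by split information."""
    ids = list(target)
    groups = {}
    for tid in ids:
        groups.setdefault(attribute_of(tid), []).append(target[tid])
    remainder = sum(len(g) / len(ids) * entropy(g) for g in groups.values())
    gain = entropy([target[t] for t in ids]) - remainder
    split_info = entropy([attribute_of(t) for t in ids])
    return gain / split_info if split_info > 0 else 0.0


# Materialized: one pass over the index, then cheap lookups (more memory).
table = materialize("school", 100.0)
ratio_materialized = gain_ratio(lambda tid: table[tid])

# On the fly: nothing stored, the index is rescanned per lookup (more time).
ratio_on_the_fly = gain_ratio(lambda tid: on_the_fly(tid, "school", 100.0))
```

Both variants evaluate the same candidate split and yield the same gain ratio; they differ only in where the cost falls. The materialized table is stored once and reused at every tree node, while the on-the-fly variant stores nothing and recomputes each value on demand.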
