Spatial data classification using decision tree models

Classification is the process of building a model that distinguishes data classes so that the model can then assign a class, drawn from a fixed set of classes, to unlabeled objects. Traditional classification models often assume that samples are i.i.d. (independent and identically distributed), but this assumption does not hold for spatial data. Tobler's first law of geography states that "everything is related to everything else, but near things are more related than distant things". In spatial classification, spatial information is therefore also taken into account when predicting the class label, because the attribute values of neighboring objects may be relevant to class membership. In this work a modified ID3 algorithm is used for classification. The ID3 algorithm uses information gain as its splitting criterion. This work modifies the splitting criterion to incorporate the spatial information of neighboring objects, which helps in predicting the class labels of objects. The modified splitting criterion is based on entropy, a measure of the level of randomness, which is the main concept behind this work. Experiments were carried out on the hyperspectral image datasets Indian Pines, Salinas, KSC and Botswana, and the results of the original ID3 were compared with those of the modified ID3. The modified ID3 shows improved accuracy over the original ID3 algorithm. The proposed model can give better results where spatial information plays a significant role.
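As a rough illustration of the idea, the following Python sketch computes Shannon entropy and the standard ID3 information gain, and then shows one possible way to make a splitting score sensitive to neighboring objects. The abstract does not give the modified criterion, so `neighborhood_entropy`, `spatial_gain`, the weight `alpha` and the adjacency-list format of `neighbors` are all hypothetical assumptions made for this example, not the method of the paper.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(items)
    if n == 0:
        return 0.0
    counts = Counter(items)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(values, labels):
    """Standard ID3 information gain of splitting `labels` on a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels remaining after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def neighborhood_entropy(values, neighbors):
    """Hypothetical measure: mean entropy of the attribute values found in
    each object's neighborhood (the object itself plus its neighbors)."""
    total = 0.0
    for i, nbrs in enumerate(neighbors):
        local = [values[i]] + [values[j] for j in nbrs]
        total += entropy(local)
    return total / len(values)


def spatial_gain(values, labels, neighbors, alpha=0.5):
    """Hypothetical spatial score: information gain penalized by how random
    the attribute is across neighborhoods. `alpha` is an assumed weight."""
    return information_gain(values, labels) - alpha * neighborhood_entropy(values, neighbors)


# Six pixels in a row; each pixel's neighbors are the adjacent pixels.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
labels = [0, 0, 0, 1, 1, 1]
smooth = ["a", "a", "a", "b", "b", "b"]      # spatially coherent attribute
scattered = ["a", "b", "a", "b", "a", "b"]   # spatially noisy attribute

print(information_gain(smooth, labels))
print(spatial_gain(smooth, labels, neighbors))
print(spatial_gain(scattered, [0, 1, 0, 1, 0, 1], neighbors))
```

In this toy setup both attributes separate their labels perfectly, so plain information gain cannot tell them apart, while the assumed spatial score ranks the spatially coherent attribute higher.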
