Learning Approximate Thematic Maps from Labeled Geospatial Data

Building accurate thematic maps, which show the distribution of a feature over a geographic area, is challenging when the sample dataset is limited in size and spatial distribution. We propose classifying such geospatial datasets as a promising approach to building approximate thematic maps. However, choosing a classification method that accounts for spatial autocorrelation in the data is not trivial. This paper investigates the application of different classification methods to real-world spatial datasets. We study how factors such as the distribution of the training data, neighborhood relationships, and the geometry of the original map affect the accuracy of the generated map, and we report measurements comparing the accuracy of the investigated methods on different datasets. Our experimental setup uses a spatial database system to compare the regions of the approximate map with those of the original, accurate map. In our experiments, a Support Vector Machine (SVM) with a radial basis function kernel outperforms all other investigated methods.