HC-DT/SVM: a tightly coupled hybrid decision tree and support vector machines algorithm with application to land cover change detections

Change detection techniques are widely used in satellite-based environmental monitoring, and multi-date classification is an important change detection technique in remote sensing. In this study, we propose a hybrid algorithm called HC-DT/SVM that tightly couples a Decision Tree (DT) algorithm and a Support Vector Machine (SVM) algorithm for land cover change detection. We aim to improve the interpretability of the classification results and the classification accuracy simultaneously. The hybrid algorithm first constructs a DT classifier using all the training samples and then sends the samples under the ill-classified decision tree branches to an SVM classifier for further training. The ill-classified decision tree branches are linked to the SVM classifier, and testing samples are classified jointly by the linked DT and SVM classifiers. Experiments using a dataset that consists of two Landsat TM scenes of a region in southern China show that the hybrid algorithm can significantly improve the classification accuracy of the classic DT classifier and improve its interpretability at the same time.
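The coupling described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn and NumPy are available, and the class name `HybridDTSVM`, the `max_depth` setting, and the `min_leaf_accuracy` threshold used to decide which branches count as "ill-classified" are all hypothetical choices made for illustration. The abstract does not state the actual criterion.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


class HybridDTSVM:
    """Illustrative DT -> SVM hybrid (hypothetical, not the paper's code)."""

    def __init__(self, max_depth=3, min_leaf_accuracy=0.9):
        # Both parameters are assumptions chosen for this sketch.
        self.tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
        self.min_leaf_accuracy = min_leaf_accuracy
        self.svm = None
        self.weak_leaves = set()

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # Step 1: build the decision tree on all training samples.
        self.tree.fit(X, y)
        leaves = self.tree.apply(X)  # leaf index reached by each sample
        correct = self.tree.predict(X) == y
        # Step 2: flag leaves whose training accuracy is below the threshold
        # (our stand-in for "ill-classified branches").
        self.weak_leaves = {
            leaf for leaf in np.unique(leaves)
            if correct[leaves == leaf].mean() < self.min_leaf_accuracy
        }
        mask = np.isin(leaves, list(self.weak_leaves))
        # Step 3: train the SVM only on samples under the flagged leaves.
        if mask.any() and len(np.unique(y[mask])) > 1:
            self.svm = SVC(kernel="rbf", gamma="scale").fit(X[mask], y[mask])
        return self

    def predict(self, X):
        X = np.asarray(X)
        # Samples in well-classified leaves keep the tree's label.
        pred = self.tree.predict(X)
        if self.svm is not None:
            # Samples routed to a flagged leaf are handed to the linked SVM.
            mask = np.isin(self.tree.apply(X), list(self.weak_leaves))
            if mask.any():
                pred[mask] = self.svm.predict(X[mask])
        return pred
```

In this sketch the tree's well-classified leaves remain directly readable as rules, while only the samples that fall into the flagged leaves are passed to the SVM, which mirrors the joint DT and SVM classification the abstract describes.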
