Improving Land Cover Classification Using Genetic Programming

Genetic Programming (GP) is a powerful Machine Learning (ML) algorithm that can produce readable white-box models. Although it has been used successfully on a wide range of problems in different scientific areas, GP is still not well known in Remote Sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs Feature Construction by evolving hyper-features from the original ones. In this work, we apply M3GP to several satellite images covering different countries to perform binary classification of burnt areas and multiclass classification of land cover types. We add the evolved hyper-features to the reference datasets and observe a significant improvement in the performance of three state-of-the-art ML algorithms (Decision Trees, Random Forests and XGBoost) on the multiclass classification datasets, with no significant effect on the binary classification ones. We show that adding the M3GP hyper-features to the reference datasets yields better results than adding the well-known spectral indices NDVI, NDWI and NBR. For the binary classification problems, we also compare the performance of the M3GP hyper-features with that of features created by other Feature Construction methods, such as FFX and EFS.
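
As a rough illustration of the baseline mentioned above, the sketch below appends NDVI, NDWI and NBR as extra columns to a per-pixel feature matrix before it is passed to a classifier. It is not the authors' code. The function names, the column layout and the zero-denominator handling are assumptions made for this example. The index definitions used are the common normalized-difference forms: NDVI = (NIR - Red) / (NIR + Red), NDWI = (Green - NIR) / (Green + NIR), and NBR = (NIR - SWIR) / (NIR + SWIR), where SWIR is the longer-wavelength shortwave infrared band.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b) elementwise, using 0 where a + b == 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    # Only divide where the denominator is non-zero; other entries stay 0.
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def add_spectral_indices(X, green, red, nir, swir2):
    """Append NDVI, NDWI and NBR columns to the feature matrix X.

    X is an (n_pixels, n_bands) array; green, red, nir and swir2 are the
    column positions of those bands in X (hypothetical layout, chosen by
    the caller to match the sensor in use).
    """
    X = np.asarray(X, dtype=float)
    ndvi = normalized_difference(X[:, nir], X[:, red])
    ndwi = normalized_difference(X[:, green], X[:, nir])
    nbr = normalized_difference(X[:, nir], X[:, swir2])
    return np.column_stack([X, ndvi, ndwi, nbr])


# Toy example: one pixel with bands ordered green, red, NIR, SWIR.
X = np.array([[0.1, 0.2, 0.6, 0.3]])
X_ext = add_spectral_indices(X, green=0, red=1, nir=2, swir2=3)
```

Evolved hyper-features could be appended in the same way, as additional columns computed from the original bands, so any classifier that accepts a feature matrix can be trained on the extended dataset without other changes.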
