Value of dimensionality reduction for crop differentiation with multi-temporal imagery and machine learning

Abstract: This study evaluates the use of automated and manual feature selection, applied prior to machine learning, for the differentiation of crops in a Mediterranean climate (Western Cape, South Africa). Five Landsat-8 images covering the phenological stages of the different crop classes were acquired and used to generate a range of spectral and textural features within an object-based image analysis (OBIA) paradigm. The features were used as input to decision tree (DT), k-nearest neighbour (k-NN), support vector machine (SVM), and random forest (RF) supervised classifiers. Classifications were first performed using all spatial variables and then repeated with reduced feature sets obtained by filtering (incrementally reducing the feature counts according to the importance that the filters allocated to features), feature extraction, and manual (semantic) feature selection. Classification and regression trees (CART) and RF were used as filter methods for feature selection. The feature-extraction methods employed were principal components analysis (PCA) and the tasselled cap transformation (TCT). The classification results were analysed by comparing the overall accuracies and kappa coefficients of each scenario, while McNemar's test was used to assess the statistical significance of differences in accuracy among classifiers. Feature selection was found to improve the overall accuracies of the DT, k-NN, and RF classifications, but it reduced the accuracy of SVM. SVM with feature extraction (PCA) on individual image dates produced the most accurate classification (96.2%). Semantic groupings of features also revealed that the image bands and indices alone are not sufficient for crop classification, and that additional features are needed. The accuracy differences among the classifiers were, however, not statistically significant, which suggests that, although dimensionality reduction can improve crop differentiation when multi-temporal Landsat-8 imagery is used, it had only a marginal effect on the results. For operational crop-type classification in the study area (and similar regions), we conclude that the SVM algorithm can be applied to the full set of features generated.
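
As an illustration of the significance testing mentioned in the abstract, the following is a minimal sketch of McNemar's test for two classifiers evaluated on the same reference samples. The abstract does not state which variant of the test was used, so the continuity-corrected chi-square form shown here is an assumption, and the function name `mcnemar_test` and the sample labels are hypothetical.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers (illustrative).

    Returns (chi2, p_value). Only samples on which the two classifiers
    disagree in correctness contribute to the statistic.
    """
    # b: classifier A correct, classifier B wrong
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    # c: classifier A wrong, classifier B correct
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)

    if b + c == 0:
        # No discordant samples: no evidence of a difference.
        return 0.0, 1.0

    # Continuity-corrected statistic, clamped at zero (an assumption of this sketch).
    chi2 = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical labels for ten image objects; not data from the study.
    truth = ["wheat"] * 10
    svm_pred = ["wheat"] * 10
    rf_pred = ["canola"] * 6 + ["wheat"] * 4
    chi2, p = mcnemar_test(truth, svm_pred, rf_pred)
    print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

A p-value above the chosen significance level (commonly 0.05) would correspond to the abstract's finding that the accuracy differences were not statistically significant.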
