A three-level Multiple-Kernel Learning approach for soil spectral analysis

Abstract: To ensure the sustainability of the soil ecosystem, which is the basis for food production, efficient large-scale baseline predictions and trend assessments of key soil properties are necessary. In that regard, visible, near-infrared, and shortwave infrared (VNIR–SWIR) spectroscopy can provide an alternative to expensive wet chemistry. In this paper, we examined the application of Multiple-Kernel Learning (MKL) to soil spectroscopy, integrating information from heterogeneous features. The proposed three-level MKL framework works as follows. At the first level, it applies multiple kernels to each spectral feature (wavelength) to maximize the information extracted from each band. At the second level, it performs implicit feature selection at the spectral source level, which makes its results interpretable. At the third level, it combines the complementary information contained in a pool of spectral sources, each derived from its own set of pre-processing techniques. At this stage, the approach can also fuse heterogeneous sources of information, such as auxiliary predictors, that can assist the spectral predictions. The experimental analysis was conducted on the pan-European LUCAS (Land Use/Cover Area frame statistical Survey) topsoil database, with the goal of predicting from the VNIR–SWIR spectra the concentration of soil organic carbon (SOC), a key indicator of agricultural productivity and environmental resilience. The particle size distribution, which describes the soil texture, was selected as the set of auxiliary predictors. The proposed MKL framework was compared with other state-of-the-art approaches, and the results indicated that it attained the best accuracy while also producing interpretable results.
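The three levels described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes RBF base kernels, and it sets the combination weights with a simple centered kernel-target alignment heuristic in place of whatever optimizer the paper uses. All function names, the `gammas` grid, and the synthetic data are hypothetical.

```python
import numpy as np

def rbf_kernel_1d(x, gamma):
    """RBF kernel computed on a single feature column x of shape (n,)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-gamma * d2)

def center(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K, y):
    """Centered alignment between K and the target kernel y y^T."""
    Kc, Ky = center(K), center(np.outer(y, y))
    denom = np.linalg.norm(Kc) * np.linalg.norm(Ky)
    return float(np.sum(Kc * Ky) / denom) if denom > 0 else 0.0

def combine(kernels, y):
    """Convex combination with weights proportional to (clipped) alignment.
    This heuristic is an assumption standing in for a learned MKL solution."""
    w = np.array([max(alignment(K, y), 0.0) for K in kernels])
    w = w / w.sum() if w.sum() > 0 else np.full(len(kernels), 1.0 / len(kernels))
    return sum(wi * Ki for wi, Ki in zip(w, kernels)), w

def three_level_kernel(sources, y, gammas=(0.1, 1.0, 10.0)):
    """sources: list of (n, p_s) arrays, one per pre-processing or auxiliary set."""
    source_kernels, band_weights = [], []
    for X in sources:
        band_kernels = []
        for j in range(X.shape[1]):
            # Level 1: several kernels per band, merged into one band kernel.
            K_band, _ = combine([rbf_kernel_1d(X[:, j], g) for g in gammas], y)
            band_kernels.append(K_band)
        # Level 2: weight the bands within a source; small weights flag
        # bands that contribute little (implicit feature selection).
        K_src, w_bands = combine(band_kernels, y)
        source_kernels.append(K_src)
        band_weights.append(w_bands)
    # Level 3: fuse the spectral sources and the auxiliary predictors.
    K, w_src = combine(source_kernels, y)
    return K, w_src, band_weights

# Synthetic stand-ins for spectra, a derivative pre-processing, and texture.
rng = np.random.default_rng(0)
n = 40
spectra = rng.normal(size=(n, 6))            # hypothetical raw reflectance bands
derivative = np.gradient(spectra, axis=1)    # hypothetical first-derivative source
texture = rng.dirichlet(np.ones(3), size=n)  # hypothetical sand/silt/clay fractions
y = spectra[:, 2] + 0.5 * texture[:, 0] + 0.1 * rng.normal(size=n)

K, w_src, band_weights = three_level_kernel([spectra, derivative, texture], y)

# Kernel ridge regression on the fused kernel (in-sample fit only).
alpha = np.linalg.solve(K + 1e-2 * np.eye(n), y)
fit = K @ alpha
```

In this sketch, `w_src` indicates how much each source contributes to the fused kernel and `band_weights` plays the role of the per-wavelength importance that makes the model interpretable. Predicting for new samples would additionally require cross-kernels between training and test points, which are omitted here.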
