SPORT pre-processing can improve near-infrared quality prediction models for fresh fruits and agro-materials

Abstract: Near-infrared spectroscopy (NIRS) is a key non-destructive technique for rapid assessment of the chemical properties of food materials. A major challenge with NIRS, however, is that the interaction of light with matter results in both absorption and scattering, so the overall NIRS signal mixes information from the two phenomena. Chemical properties such as dry matter, Brix and lipids are predicted from light reflection/absorption. Therefore, when the aim of the data analysis is to predict chemical components, scattering effects should be removed from the spectra as far as possible. Several pre-processing techniques are available to do this, but it is often difficult to decide which one to choose. In this article we present the use of a recently developed pre-processing approach, sequential pre-processing through orthogonalization (SPORT), to improve the predictive power of multivariate models based on NIR spectra of food materials. The SPORT approach uses sequential orthogonalized partial least squares regression (SO-PLS) to fuse data blocks, each corresponding to a different spectral pre-processing technique. The results were compared with pre-processing techniques commonly used in the NIRS analysis of food materials. The comparison was made on five datasets covering apples, apricots, olive oils and grapes, with associated chemical properties such as dry matter (DM), Brix, lipids and citric acid. The datasets came from both reflection and transmission measurements. The results showed that the fusion-based pre-processing methodology is an ideal choice for pre-processing of NIRS data. For four out of five datasets, the prediction accuracies improved (higher R2pred and lower RMSEP). The improvement amounted to as much as a 20 % increase in R2pred and a 25 % decrease in RMSEP compared to the standard 2nd derivative pre-processing. The pre-processing fusion was more effective in reflection mode than in transmission mode. Multiple pre-processing techniques provided complementary information, and their fusion using the SPORT approach therefore improved model performance. The methodology is not only applicable to food materials but can be used as a general pre-processing approach for all types of modeling of spectral data.
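The fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only NumPy, uses synthetic spectra, replaces a Savitzky-Golay 2nd derivative with a plain finite difference, and picks the number of components per block arbitrarily. All function names (`snv`, `second_derivative`, `pls1_rotations`, `sport_fit`, `sport_predict`) are hypothetical. Each pre-processed copy of the spectra is treated as one block; every block after the first is orthogonalized against the scores already extracted before a PLS model is fitted to the remaining residual of the response.

```python
import numpy as np

def snv(X):
    """Standard normal variate: centre and scale each spectrum (row)."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def second_derivative(X):
    """Finite-difference 2nd derivative (assumed stand-in for Savitzky-Golay)."""
    return np.diff(X, n=2, axis=1)

def pls1_rotations(X, y, n_comp):
    """NIPALS PLS1 on centred X and y; returns R such that scores T = X @ R."""
    Xr, yr, W, P = X.copy(), y.copy(), [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)
        t = Xr @ w
        p = Xr.T @ t / (t @ t)
        Xr = Xr - np.outer(t, p)          # deflate X
        yr = yr - t * (t @ yr) / (t @ t)  # deflate y
        W.append(w)
        P.append(p)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.inv(P.T @ W)

def sport_fit(blocks, y, n_comps):
    """Sequentially fit blocks, orthogonalizing each one on earlier scores."""
    y_mean = y.mean()
    yc = y - y_mean
    T_all = np.empty((len(y), 0))
    resid, steps, q = yc.copy(), [], None
    for X, a in zip(blocks, n_comps):
        x_mean = X.mean(axis=0)
        Xc = X - x_mean
        B = None
        if T_all.shape[1] > 0:
            # Remove from this block everything explained by earlier scores
            B = np.linalg.lstsq(T_all, Xc, rcond=None)[0]
            Xc = Xc - T_all @ B
        R = pls1_rotations(Xc, resid, a)
        T_all = np.hstack([T_all, Xc @ R])
        # Regress the centred response on all scores collected so far
        q = np.linalg.lstsq(T_all, yc, rcond=None)[0]
        resid = yc - T_all @ q
        steps.append((x_mean, B, R))
    return {"steps": steps, "q": q, "y_mean": y_mean}

def sport_predict(model, blocks):
    """Apply the stored centring, orthogonalization and rotations to new data."""
    T_all = np.empty((blocks[0].shape[0], 0))
    for X, (x_mean, B, R) in zip(blocks, model["steps"]):
        Xc = X - x_mean
        if B is not None:
            Xc = Xc - T_all @ B
        T_all = np.hstack([T_all, Xc @ R])
    return model["y_mean"] + T_all @ model["q"]

# Synthetic spectra: one analyte band on a broad background, distorted by
# multiplicative (gain) and additive (offset) scatter-like effects.
rng = np.random.default_rng(0)
n, p = 120, 100
wl = np.arange(p)
background = np.exp(-0.5 * ((wl - 30) / 15.0) ** 2)
band = np.exp(-0.5 * ((wl - 70) / 5.0) ** 2)
conc = rng.uniform(0.0, 1.0, n)
gain = rng.uniform(0.8, 1.2, (n, 1))
offset = rng.normal(0.0, 0.1, (n, 1))
X = gain * (background + conc[:, None] * band) + offset + rng.normal(0, 0.005, (n, p))
y = conc + rng.normal(0, 0.02, n)

# Both pre-processings act row-wise, so they can be applied before splitting
blocks = [snv(X), second_derivative(X)]
train, test = np.arange(80), np.arange(80, n)
model = sport_fit([b[train] for b in blocks], y[train], n_comps=[3, 2])
y_pred = sport_predict(model, [b[test] for b in blocks])
rmsep = float(np.sqrt(np.mean((y[test] - y_pred) ** 2)))
print(f"RMSEP on held-out synthetic spectra: {rmsep:.3f}")
```

In practice the block order and the number of components per block would be chosen by cross-validation rather than fixed as here, and the derivative block would normally use a smoothing filter to limit noise amplification.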
