The impact of signal pre-processing on the final interpretation of analytical outcomes - A tutorial.

This tutorial analyses and critically discusses the consequences of row pre-processing (conversion of measurement units, derivatives, and the standard normal variate transform) on the evaluation of the final outcomes of chemometric data analysis. It focuses in depth on how pre-processing affects signal shape and how it can cause results to be misinterpreted, a crucial and often disregarded issue in the analytical field. It is shown how this preliminary step of data processing may, in many cases, lead to incongruous conclusions that are based not on real information embodied in the data but on artefacts arising from the mathematical transforms. The tutorial is not limited to a description of the problem: it also introduces strategies and tools for overcoming such unwanted effects, allowing the importance of the original variables to be interpreted directly and explaining the chemical information that actually characterises the samples. The dangerous implications of row pre-processing for instrumental signals are demonstrated on real datasets from different analytical techniques: transmission and attenuated total reflection infrared spectroscopy, cyclic voltammetry, X-ray fluorescence spectroscopy, Raman spectroscopy, and ultraviolet-visible spectroscopy. Hence, the impact of this widespread problem across most branches of analytical chemistry is illustrated.
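The kind of artefact described above can be illustrated with a minimal sketch. It assumes purely synthetic signals made of two Gaussian bands, not any of the datasets used in the tutorial. The function name `snv` and all variable names are hypothetical and chosen only for illustration. The two simulated samples have exactly the same band A and differ only in band B. After the standard normal variate (SNV) transform, they also appear to differ at band A.

```python
import numpy as np

def snv(signals):
    """Row-wise SNV: centre each signal on its own mean and scale by its own standard deviation."""
    signals = np.asarray(signals, dtype=float)
    mean = signals.mean(axis=1, keepdims=True)
    std = signals.std(axis=1, ddof=1, keepdims=True)
    return (signals - mean) / std

# Synthetic axis and two Gaussian bands (illustrative assumptions, not real spectra)
x = np.linspace(0.0, 1.0, 201)
band_a = np.exp(-0.5 * ((x - 0.3) / 0.03) ** 2)
band_b = np.exp(-0.5 * ((x - 0.7) / 0.03) ** 2)

# Both samples share the same band A; only the intensity of band B differs
sample_1 = band_a + 1.0 * band_b
sample_2 = band_a + 3.0 * band_b
raw = np.vstack([sample_1, sample_2])
transformed = snv(raw)

# Compare the two samples at the maximum of band A, before and after SNV
idx_a = int(np.argmin(np.abs(x - 0.3)))
raw_diff_a = raw[1, idx_a] - raw[0, idx_a]
snv_diff_a = transformed[1, idx_a] - transformed[0, idx_a]

print(f"difference at band A, raw signals: {raw_diff_a:.3e}")
print(f"difference at band A, after SNV:   {snv_diff_a:.3f}")
```

In this sketch the raw difference at band A is numerically zero, while the difference after SNV is clearly non-zero. That difference comes entirely from the change in each row's mean and standard deviation caused by band B. Reading it as chemical information about band A would be exactly the type of misinterpretation the tutorial warns against.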
