PLS2 in Metabolomics

Metabolomics is the systematic study of the small-molecule profiles of biological samples produced by specific cellular processes. The high-throughput technologies used in metabolomic investigations generate datasets in which the variables are strongly correlated and much of the information is redundant. Extracting the hidden information is a challenge that calls for suitable data analysis approaches. Projection to latent structures regression (PLS) has successfully solved a large number of problems, from multivariate calibration to classification, and has become a basic tool of metabolomics. PLS2 is the most widely used implementation of PLS. Despite its success, PLS2 shows some limitations when so-called ‘structured noise’ affects the data, and methods have recently been introduced to overcome them. This study provides a comprehensive and up-to-date presentation of PLS2 focused on metabolomics. After a brief discussion of the mathematical framework of PLS2, the post-transformation procedure is introduced as a basic tool for model interpretation. Orthogonally constrained PLS2 is then presented as a strategy for including constraints in the model according to the experimental design. Two experimental datasets are investigated to show how PLS2 and its improvements work in practice.
