Chemometric approaches to improve PLSDA model outcome for predicting human non-alcoholic fatty liver disease using UPLC-MS as a metabolic profiling tool

An MS-based metabolomics strategy combining variable selection with PLSDA analysis has been assessed as a tool to discriminate between non-steatotic and steatotic human liver profiles. Different chemometric approaches for uninformative variable elimination were applied using two of the software packages most commonly employed in metabolomics (i.e., MATLAB and SIMCA-P). The first approach was performed in MATLAB, where the PLS regression vector coefficient values were used to classify variables as informative or uninformative. The second approach was run in SIMCA-P, where variables were selected according to both the PLS regression vector coefficients and the VIP (variable importance in projection) scores. PLSDA model performance features, such as model validation, variable selection criteria, and potential biomarker output, were compared across approaches. Variable selection improved the classification performance of all the models, facilitated metabolite identification, and provided enhanced insight into the metabolic information acquired by the UPLC-MS method. The results suggest that the proposed strategy is a straightforward way to improve model performance. Glutathione (GSH), lysophospholipids and bile acids were among the most important altered metabolites in the metabolomic profiles studied. However, further research and more in-depth biochemical interpretation are needed before they can be unambiguously proposed as disease biomarkers.
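The two selection criteria named above, regression coefficients and VIP scores, can be illustrated with a minimal sketch. This is not the MATLAB or SIMCA-P procedure used in the study. It is a plain-Python NIPALS PLS1 fit with a single 0/1 class vector, run on a small synthetic data set invented for illustration. The function name `pls1_vip`, the mean-centring-only preprocessing, and the VIP > 1 cut-off (a common rule of thumb) are all assumptions.

```python
import math


def pls1_vip(X, y, n_components):
    """NIPALS PLS1 on mean-centred data; returns (coefficients, VIP scores)."""
    n, p = len(X), len(X[0])
    # Mean-centre X and y (assumed preprocessing; no unit-variance scaling here).
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]

    W, P, R, q, ss = [], [], [], [], []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between residual X and y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores, X loadings and y loading for this component.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q_a = sum(f[i] * t[i] for i in range(n)) / tt
        # r maps the original centred X onto the scores:
        # r_a = w_a - sum_{j<a} (p_j' w_a) r_j
        r = w[:]
        for p_j, r_j in zip(P, R):
            c = sum(a * b for a, b in zip(p_j, w))
            r = [rv - c * rj for rv, rj in zip(r, r_j)]
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q_a * t[i] for i in range(n)]
        W.append(w)
        P.append(p_load)
        R.append(r)
        q.append(q_a)
        ss.append(q_a * q_a * tt)  # y variance explained by this component

    # Regression vector for centred X: b = sum_a q_a * r_a
    coef = [sum(q[a] * R[a][j] for a in range(n_components)) for j in range(p)]
    # VIP_j = sqrt(p * sum_a(ss_a * w_ja^2) / sum_a(ss_a)), with unit-norm w_a
    total = sum(ss)
    vip = [
        math.sqrt(p * sum(ss[a] * W[a][j] ** 2 for a in range(n_components)) / total)
        for j in range(p)
    ]
    return coef, vip


# Synthetic example: 6 samples x 3 variables; only variable 0 separates the classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.1],
    [0.9, 4.0, 1.9],
    [3.0, 4.5, 2.0],
    [3.1, 3.5, 2.2],
    [2.8, 4.0, 1.8],
]
y = [0, 0, 0, 1, 1, 1]  # hypothetical coding: 0 = non-steatotic, 1 = steatotic

coef, vip = pls1_vip(X, y, n_components=2)
# Keep variables whose VIP exceeds 1 (rule-of-thumb threshold, an assumption).
selected = [j for j, v in enumerate(vip) if v > 1.0]
print("coefficients:", coef)
print("VIP scores:", vip)
print("selected variables:", selected)
```

Because the squared VIP scores average to 1 across variables, a cut-off of 1 separates above-average from below-average contributors. In practice the threshold and the number of components would be chosen by cross-validation rather than fixed in advance.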
