Comparative analysis of targeted metabolomics: Dominance-based rough set approach versus orthogonal partial least square-discriminant analysis

BACKGROUND Metabolomics is an emerging field that characterizes a metabolic profile from a combination of small molecules and has applications in health. Metabolomic methods are currently used to discover diagnostic biomarkers and to identify the pathophysiological pathways involved in disease. However, metabolomic data are complex and are usually analyzed with statistical methods. Although these methods have been widely described, most have been neither standardized nor validated. Because data analysis is the foundation of a robust methodology, new mathematical methods are needed to assess and complement current ones. We therefore applied, for the first time, the dominance-based rough set approach (DRSA) to metabolomics data, and we assessed how well this method complements standard statistical methods. Some attributes were transformed so that global and local monotonic relationships between condition and decision attributes could be discovered. We used previously published metabolomics data (18 variables) from amyotrophic lateral sclerosis (ALS) and non-ALS patients.

RESULTS Principal component analysis (PCA) and orthogonal partial least square-discriminant analysis (OPLS-DA) allowed satisfactory discrimination (72.7%) between ALS and non-ALS patients. Several discriminant metabolites were identified: acetate, acetone, pyruvate and glutamine. Univariate analysis also identified the concentrations of acetate and pyruvate as significantly different between ALS and non-ALS patients. DRSA correctly classified 68.7% of the cases and established rules involving some of the metabolites highlighted by OPLS-DA (acetate and acetone). Some rules identified potential biomarkers not revealed by OPLS-DA (beta-hydroxybutyrate). After applying Bayesian confirmation measures, we also found a large number of discriminating metabolites in common, particularly acetate, pyruvate, acetone and ascorbate, consistent with the pathophysiological pathways involved in ALS.

CONCLUSION DRSA provides a complementary method for improving the predictive performance of the multivariate data analysis usually used in metabolomics. This method could help identify metabolites involved in disease pathogenesis. Interestingly, the different strategies mostly identified the same metabolites as discriminant. Selecting strong decision rules with high Bayesian confirmation values provides useful information about relevant condition-decision relationships not otherwise revealed in metabolomics data.
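To make the idea of scoring a decision rule by Bayesian confirmation more concrete, here is a minimal sketch. The abstract does not say which confirmation measure was used, so this example assumes the difference measure s(H, E) = P(H | E) - P(H | not E), one of several measures in the literature. The `Sample` type, the `confirmation_s` helper, the toy concentrations, the threshold and the direction of the rule are all invented for illustration and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Sample:
    acetate: float  # hypothetical concentration, arbitrary units
    is_als: bool    # decision attribute: True for ALS, False for non-ALS


def confirmation_s(samples: Sequence[Sample],
                   premise: Callable[[Sample], bool]) -> float:
    """Difference measure s = P(H | E) - P(H | not E).

    H is "the patient has ALS"; E is "the rule premise holds".
    Probabilities are estimated as relative frequencies in `samples`.
    """
    covered = [x for x in samples if premise(x)]
    uncovered = [x for x in samples if not premise(x)]
    if not covered or not uncovered:
        raise ValueError("premise must cover some, but not all, samples")
    p_h_given_e = sum(x.is_als for x in covered) / len(covered)
    p_h_given_not_e = sum(x.is_als for x in uncovered) / len(uncovered)
    return p_h_given_e - p_h_given_not_e


# Invented toy data, not values from the paper.
samples = [
    Sample(0.9, True), Sample(0.8, True), Sample(0.7, False),
    Sample(0.3, True), Sample(0.2, False), Sample(0.1, False),
]


# Hypothetical monotonic rule: "if acetate >= 0.7 then ALS".
def rule(x: Sample) -> bool:
    return x.acetate >= 0.7


score = confirmation_s(samples, rule)
print(f"s = {score:.3f}")
```

A positive value means that satisfying the premise raises the estimated probability of the ALS class, a value near zero means the premise is uninformative, and a negative value means it counts against the class. Ranking rules by such a score is one plausible way to pick out the "strong" rules mentioned in the conclusion.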
