Independent Component Analysis and Statistical Modelling for the Identification of Metabolomics Biomarkers in 1H-NMR Spectroscopy

To maintain life, living organisms produce and transform small molecules called metabolites. Metabolomics studies, through these metabolites, how biological reactions develop in response to a physio-pathological stimulus. 1H-NMR spectroscopy is widely used to describe the metabolite composition of a sample graphically, in the form of spectra. Biologists can then confirm or rule out the development of a biological reaction according to whether specific NMR spectral regions change from one physiological situation to another. However, this process presupposes a preliminary identification step, which traditionally consists of studying the first two components of a Principal Component Analysis (PCA). This paper presents a new two-step methodology that provides knowledge about specific 1H-NMR spectral areas through the identification of biomarkers and the visualization of the effects caused by external changes. The first step uses Independent Component Analysis (ICA) to decompose the spectral data into statistically independent components, or sources of information. The independent (pure or composite) metabolites contained in biofluids are recovered through the sources, and their quantities through the mixing weights. Specific questions related to ICA, such as the choice of the number of components and their ordering, are discussed. The second step consists of statistical modelling of the ICA mixing weights and introduces statistical hypothesis tests on the parameters of the estimated models, with the objective of selecting the sources that contain biomarkers (that is, significantly fluctuating spectral regions). Statistical models are used here because they adapt to different kinds of data and contexts. A computation of contrasts is also proposed, which can be used to visualize the changes in the spectra caused by changes in the factor of interest.
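The ICA step described above can be illustrated with a minimal sketch. It assumes the usual linear model X ≈ A S, with one spectrum per row of X, the mixing weights in A and the sources in S; the abstract does not spell out the notation, so these symbols, the function names (`sym_decorrelate`, `ica_spectra`), the simulated peaks and the choice of a FastICA-style fixed-point iteration with a tanh nonlinearity are all assumptions made for illustration, not necessarily the authors' implementation.

```python
import numpy as np

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W, so the rows become orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    s = np.clip(s, 1e-12, None)          # guard against tiny eigenvalues
    return (u / np.sqrt(s)) @ u.T @ W

def ica_spectra(X, q, n_iter=500, tol=1e-8, seed=0):
    """Decompose spectra X (n_samples x n_points) as Xc ~ A @ S with q sources.

    Returns A (n_samples x q mixing weights), S (q x n_points sources)
    and Xc (X with each spectrum centred across spectral points).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    # PCA whitening: keep q components, scaled so that Z @ Z.T / m = I.
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Vt[:q] * np.sqrt(m)
    K = U[:, :q] * d[:q] / np.sqrt(m)    # Xc ~ K @ Z
    W = sym_decorrelate(rng.standard_normal((q, q)))
    for _ in range(n_iter):
        Y = np.tanh(W @ Z)
        # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w for every row w of W.
        W_new = (Y @ Z.T) / m - (1.0 - Y**2).mean(axis=1, keepdims=True) * W
        W_new = sym_decorrelate(W_new)
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    S = W @ Z                            # estimated sources
    A = K @ W.T                          # estimated mixing weights
    return A, S, Xc

# Simulated example (hypothetical data): 30 spectra built from 3 peak patterns.
rng = np.random.default_rng(1)
ppm = np.linspace(0.0, 10.0, 400)
peak = lambda c: np.exp(-0.5 * ((ppm - c) / 0.1) ** 2)
S_true = np.vstack([peak(2.0) + peak(6.5), peak(3.5), peak(8.0)])
A_true = rng.uniform(0.5, 2.0, size=(30, 3))
X = A_true @ S_true + 0.01 * rng.standard_normal((30, 400))

A, S, Xc = ica_spectra(X, q=3)
rel_err = np.linalg.norm(Xc - A @ S) / np.linalg.norm(Xc)
print(A.shape, S.shape, rel_err)
```

In this sketch the number of sources `q` is fixed by hand, and the sources are returned in arbitrary order and sign; the choice of the number of components and their ordering are exactly the questions the paper says it discusses.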
This methodology is innovative because it allows multi-factor studies (through the use of mixed models) together with statistical confirmation of the factor effects.

Citation: Féraud B, Rousseau R, de Tullio P, Verleysen M, Govaerts B (2017) Independent Component Analysis and Statistical Modelling for the Identification of Metabolomics Biomarkers in 1H-NMR Spectroscopy. J Biom Biostat 8: 367. doi: 10.4172/2155-6180.1000367
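The second step, modelling the mixing weights and testing the effect of a factor, might look roughly like the sketch below. It is a deliberately simplified stand-in: a single two-level factor, an ordinary fixed-effect linear model per source instead of the mixed models mentioned above, and a permutation test in place of a parametric test so that only NumPy is needed. The names `ols_t_stat`, `permutation_p_value`, `group` and `alpha`, the simulated weights and sources, and the way the contrast is mapped back to the spectral axis are all assumptions for illustration.

```python
import numpy as np

def ols_t_stat(a, group):
    """Fit a = b0 + b1 * group + error; return b1 and its t statistic."""
    D = np.column_stack([np.ones_like(group, dtype=float), group])
    beta, *_ = np.linalg.lstsq(D, a, rcond=None)
    resid = a - D @ beta
    sigma2 = resid @ resid / (len(a) - D.shape[1])
    cov = sigma2 * np.linalg.inv(D.T @ D)
    return beta[1], beta[1] / np.sqrt(cov[1, 1])

def permutation_p_value(a, group, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the group effect (labels are shuffled)."""
    rng = np.random.default_rng(seed)
    _, t_obs = ols_t_stat(a, group)
    t_perm = np.array([ols_t_stat(a, rng.permutation(group))[1]
                       for _ in range(n_perm)])
    return (1 + np.sum(np.abs(t_perm) >= abs(t_obs))) / (n_perm + 1)

# Hypothetical inputs: mixing weights A (40 samples x 3 sources) and sources S.
rng = np.random.default_rng(2)
group = np.repeat([0.0, 1.0], 20)                # factor of interest, two levels
A = rng.normal(1.0, 0.2, size=(40, 3))
A[:, 0] += 1.0 * group                           # only source 0 reacts to the factor
ppm = np.linspace(0.0, 10.0, 400)
peak = lambda c: np.exp(-0.5 * ((ppm - c) / 0.1) ** 2)
S = np.vstack([peak(2.0), peak(3.5), peak(8.0)])

# One model and one test per source.
effects = np.array([ols_t_stat(A[:, k], group)[0] for k in range(A.shape[1])])
p_values = np.array([permutation_p_value(A[:, k], group, seed=k)
                     for k in range(A.shape[1])])

# Keep the sources whose weights change significantly with the factor.
alpha = 0.05
selected = np.flatnonzero(p_values < alpha)

# Contrast: estimated change in the spectrum when the factor goes from 0 to 1,
# obtained by weighting the selected sources by their estimated effects.
contrast = effects[selected] @ S[selected]
print(selected, p_values, ppm[np.argmax(np.abs(contrast))])
```

Plotting `contrast` against `ppm` gives one possible visualization of the spectral regions affected by the factor. No multiple-testing correction is applied here; with many sources, one would normally be added.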
