Group aggregating normalization method for the preprocessing of NMR-based metabolomic data

Data normalization plays a crucial role in metabolomics because it accounts for the inevitable variation in sample concentration and in the efficiency of the sample preparation procedure. Conventional methods such as constant sum normalization (CSN) and probabilistic quotient normalization (PQN) are widely used, but each has its own shortcomings. In the current study, a new data normalization method called group aggregating normalization (GAN) is proposed, in which samples are normalized so that they aggregate close to their group centers in a principal component analysis (PCA) subspace. This is in contrast to CSN and PQN, which rely on a constant reference for all samples. Evaluation of the GAN method using both simulated and experimental metabolomic data demonstrated that GAN produces more robust models in the subsequent multivariate data analysis than either CSN or PQN. The current study also demonstrated that some of the differential metabolites identified using the CSN or PQN method could be false positives due to improper data normalization.
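To make the two conventional methods concrete, the following is a minimal Python sketch of CSN and PQN as they are commonly described in the literature. The abstract itself gives no formulas, so the function names `csn` and `pqn`, the use of the median spectrum as the PQN reference, and the toy data are all assumptions made for illustration. GAN is not sketched, because the abstract does not specify its procedure beyond the PCA-subspace idea.

```python
import numpy as np


def csn(X, total=1.0):
    """Constant sum normalization: scale each spectrum (row) to sum to `total`."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True) * total


def pqn(X, reference=None):
    """Probabilistic quotient normalization (common textbook form, assumed here).

    1. Apply CSN as a first pass.
    2. Take a reference spectrum (assumption: the column-wise median).
    3. Divide each spectrum by the median of its quotients to the reference.
    """
    X = csn(X)
    if reference is None:
        reference = np.median(X, axis=0)
    usable = reference > 0  # skip variables with a zero reference intensity
    quotients = X[:, usable] / reference[usable]
    factors = np.median(quotients, axis=1)
    return X / factors[:, None]


# Toy data (hypothetical): one profile measured at four dilutions.
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
dilution = np.array([1.0, 2.0, 0.5, 3.0])
X = base * dilution[:, None]
X[3, 4] *= 10.0  # one metabolite changes strongly in the last sample

X_csn = csn(X)
X_pqn = pqn(X)

# The four unchanged variables of the last sample, before and after each method.
print("CSN:", X_csn[3, :4], "vs", X_csn[0, :4])
print("PQN:", X_pqn[3, :4], "vs", X_pqn[0, :4])
```

In this toy case a single large peak changes the total intensity of the last sample, so CSN shifts all of its other variables, whereas the quotient median used by PQN is unaffected by that one peak. Both methods still apply a single reference to every sample, which is the property the abstract contrasts with GAN.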
