Characterization of the measurement error structure in 1D 1H NMR data for metabolomics studies.

NMR-based metabolomics is characterized by high-throughput measurements of the signal intensities of complex mixtures of metabolites in biological samples, typically by assaying biofluids or tissue homogenates. The ultimate goal is to obtain relevant biological information about differences in the pathophysiological conditions that the samples experience. This information has long been obtained by analyzing the measured NMR signals with multivariate statistics. NMR data are quite complex, and analyzing them with multivariate statistical methods such as principal components analysis (PCA) assumes that the data are multivariate normal, with errors that are independent and identically distributed normal (i.e. iid normal). There is a consensus that these assumptions do not always hold for these data, and several methods have therefore been devised to transform or weight the data prior to analysis by PCA. However, the structure of NMR measurement noise has not been characterized, nor has the extent to which violations of error homoscedasticity affect PCA results been investigated. In this work, a comprehensive characterization of measurement uncertainties in NMR-based metabolomics was achieved using an experiment designed to capture the contributions of several sources of error to the total variance in the measurements. The noise structure was found to be heteroscedastic and highly correlated with spectral characteristics that resemble the mean of the spectra and their standard deviation. A model was subsequently developed that potentially allows errors in NMR measurements to be accurately estimated without the need for extensive replication.
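As a rough illustration of what characterizing the error structure from replicates involves, the sketch below estimates an error covariance matrix from simulated replicate spectra and checks whether the per-channel standard deviation tracks the mean signal. Everything in it is an assumption made for illustration and is not taken from the study: the single Lorentzian-shaped peak, the noise model (a constant term plus a term proportional to the signal), the parameter values, and the function name `error_covariance`. The simulated noise is independent across channels, so the sketch shows only the heteroscedastic part of the picture, not the correlated part.

```python
import numpy as np


def error_covariance(replicates):
    """Sample error covariance from replicate spectra.

    replicates: array of shape (n_replicates, n_channels).
    Returns an (n_channels, n_channels) covariance matrix.
    """
    residuals = replicates - replicates.mean(axis=0)
    return residuals.T @ residuals / (replicates.shape[0] - 1)


rng = np.random.default_rng(0)
n_rep, n_chan = 200, 50

# Hypothetical "true" spectrum: one Lorentzian-shaped peak (illustration only).
x = np.linspace(-1.0, 1.0, n_chan)
true_spectrum = 1.0 / (1.0 + (x / 0.1) ** 2)

# Assumed noise model: constant baseline noise plus noise proportional to signal.
noise_sd = 0.01 + 0.05 * true_spectrum
replicates = true_spectrum + rng.normal(size=(n_rep, n_chan)) * noise_sd

cov = error_covariance(replicates)
est_sd = np.sqrt(np.diag(cov))          # per-channel error standard deviation
mean_spectrum = replicates.mean(axis=0)

# Heteroscedasticity check: does the error SD vary across channels,
# and does it follow the mean spectrum?
sd_ratio = est_sd.max() / est_sd.min()
r = np.corrcoef(mean_spectrum, est_sd)[0, 1]
print(f"max/min error SD across channels: {sd_ratio:.1f}")
print(f"correlation between mean spectrum and error SD: {r:.2f}")
```

Under iid errors the ratio of largest to smallest standard deviation would stay close to 1 and the diagonal of `cov` would be roughly flat; a ratio well above 1 that follows the mean spectrum is the kind of signal-dependent behavior the abstract describes. Off-diagonal structure in `cov` would be the place to look for correlated errors in real replicate data.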
