Automatic alignment of individual peaks in large high-resolution spectral data sets.

Pattern recognition techniques are effective tools for reducing the information contained in large spectral data sets to a much smaller number of significant features, which can then be used to interpret the chemical or biochemical system under study. The effectiveness of such approaches is often impeded by experimental and instrument-induced variations in the position, phase, and line width of the spectral peaks. Although characterizing the cause and magnitude of these fluctuations can be important in its own right (pH-induced NMR chemical shift changes, for example), in general they obscure the process of pattern discovery. One major area of application is the use of large databases of ¹H NMR spectra of biofluids such as urine to investigate perturbations in metabolic profiles caused by drugs or disease, a process now termed metabonomics. Frequency shifts of individual peaks are the dominant source of unwanted variation in this type of data. In this paper, an automatic procedure for aligning the individual peaks in the data set is described and evaluated. The proposed method will be vital for the efficient and automatic analysis of large metabonomic data sets and should also be applicable to other types of data.
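The abstract does not specify how the alignment is carried out, so the following is only a minimal illustrative sketch of what aligning one peak across a set of spectra can look like. It is not the procedure proposed in the paper: it swaps in a generic integer-lag cross-correlation against a chosen reference spectrum. The function names (`lorentzian`, `estimate_shift`, `align_window`, `demo`), the window limits, and the synthetic Lorentzian data are all assumptions made for illustration.

```python
import numpy as np


def lorentzian(x, center, width):
    """Unit-height Lorentzian line centred at `center` with half-width `width`."""
    return width**2 / ((x - center) ** 2 + width**2)


def estimate_shift(segment, reference, max_shift):
    """Integer lag (in points) that best maps `segment` onto `reference`.

    Tries every lag in [-max_shift, max_shift] and keeps the one with the
    largest inner product (a circular cross-correlation over the window).
    """
    lags = np.arange(-max_shift, max_shift + 1)
    scores = [np.dot(np.roll(segment, lag), reference) for lag in lags]
    return int(lags[int(np.argmax(scores))])


def align_window(spectra, lo, hi, ref_index=0, max_shift=20):
    """Align the peak inside points [lo, hi) of each spectrum to one reference row.

    Only the chosen window is modified, so other peaks keep their positions.
    Returns the aligned copy and the lag applied to each spectrum.
    """
    aligned = spectra.copy()
    reference = spectra[ref_index, lo:hi]
    applied = []
    for i, row in enumerate(spectra):
        lag = estimate_shift(row[lo:hi], reference, max_shift)
        aligned[i, lo:hi] = np.roll(row[lo:hi], lag)
        applied.append(lag)
    return aligned, np.array(applied)


def demo():
    """Synthetic data: one peak that drifts between spectra, one that does not."""
    x = np.arange(400)
    true_shifts = np.array([0, 5, -3, 8, -6])  # hypothetical drifts, in points
    spectra = np.array(
        [lorentzian(x, 200 + s, 3.0) + lorentzian(x, 60, 3.0) for s in true_shifts]
    )
    aligned, lags = align_window(spectra, 100, 300, ref_index=0, max_shift=20)
    return true_shifts, lags, aligned
```

Calling `demo()` returns the simulated drifts, the lags the sketch applied, and the aligned spectra. Restricting the correction to a window reflects the abstract's point that individual peaks shift independently, so a single global shift per spectrum would not be enough.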
