NMR metabolic analysis of samples using fuzzy K‐means clustering

The global analysis of metabolites can be used to define the phenotypes of cells, tissues or organisms. Classifying groups of samples based on their metabolic profiles is one of the main topics of metabolomics research. Crisp clustering methods assign each feature to exactly one cluster, thereby omitting information about the multiplicity of sample subtypes. Here, we present the application of the fuzzy K‐means clustering method to the classification of samples based on metabolomic 1D 1H NMR fingerprints. Sample classification was performed on NMR spectra of cancer cell line extracts and of urine samples from type 2 diabetes patients and animal models. The cell line dataset comprised NMR spectra of lipophilic cell extracts from two normal and three cancer cell lines, the latter including two invasive lines and one non‐invasive line. The second dataset comprised previously published NMR spectra of urine samples from human type 2 diabetics and healthy controls, wild‐type and diabetes‐model mice, and obese and lean rats. The fuzzy K‐means clustering method classified samples more accurately in both datasets than the other tested methods, namely principal component analysis (PCA), hierarchical clustering (HCL) and K‐means clustering. In the cell line samples, fuzzy clustering clearly separated the individual cell lines, the cancer from the normal cell lines, and the non‐invasive from the invasive tumour cell lines. In the diabetes dataset, clear separation of healthy controls and diabetics in all three models was possible only with the fuzzy clustering method. Copyright © 2009 Crown in the right of Canada. Published by John Wiley & Sons, Ltd.
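To illustrate the contrast between crisp and fuzzy assignment described above, the following is a minimal sketch of a generic fuzzy K‐means (fuzzy c‐means) iteration. It is not the authors' implementation: the function name `fuzzy_kmeans`, the fuzzifier `m = 2`, the Euclidean distance, the random membership initialisation and the synthetic "spectra" are all assumptions made for illustration only.

```python
import numpy as np

def fuzzy_kmeans(X, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Generic fuzzy c-means sketch (hypothetical helper, not from the paper).

    X : (n_samples, n_features) array, e.g. binned spectral intensities.
    k : number of clusters.
    m : fuzzifier (> 1); m = 2 is an assumed, commonly used default.
    Returns the membership matrix U (n_samples, k) and the cluster centres.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalised to sum to 1.
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centres are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every centre.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        # Membership update: closer centres receive higher membership.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Synthetic stand-in for two groups of spectra (assumed data, 5 samples each).
demo_rng = np.random.default_rng(1)
X = np.vstack([demo_rng.normal(0.0, 0.1, (5, 3)),
               demo_rng.normal(5.0, 0.1, (5, 3))])
U, centers = fuzzy_kmeans(X, k=2)
labels = U.argmax(axis=1)  # crisp labels, if a hard assignment is needed
```

Each row of `U` holds one sample's graded membership in every cluster. A crisp method would report only `labels`, which discards the information that a sample lies between groups.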
