Cross-platform comparison and visualisation of gene expression data using co-inertia analysis

Background: Rapid development of DNA microarray technology has led different laboratories to adopt numerous protocols and technological platforms, which has severely impaired the comparability of array data. Current cross-platform comparisons of microarray gene expression data are usually based on cross-referencing the annotation of each gene transcript represented on the arrays, extracting a list of genes common to all arrays, and comparing expression data for this gene subset. Unfortunately, filtering genes to a subset represented across all arrays often excludes many thousands of genes, because different arrays represent different subsets of the genome. We describe the application of a powerful yet simple method for cross-platform comparison of gene expression data. Co-inertia analysis (CIA) is a multivariate method that identifies trends or co-relationships in multiple datasets that contain the same samples. CIA simultaneously finds ordinations (dimension reduction diagrams) from the datasets that are most similar. It does this by finding successive axes from the two datasets with maximum covariance. CIA can be applied to datasets where the number of variables (genes) far exceeds the number of samples (arrays), as is the case with microarray analyses.

Results: We illustrate the power of CIA for cross-platform analysis of gene expression data by using it to identify the main common relationships in expression profiles on a panel of 60 tumour cell lines from the National Cancer Institute (NCI), which have been subjected to microarray studies using both Affymetrix and spotted cDNA array technology. The co-ordinates of the CIA projections of the cell lines from each dataset are graphed in a bi-plot and are connected by a line, the length of which indicates the divergence between the two datasets. Thus, CIA provides a graphical representation of consensus and divergence between the gene expression profiles from different microarray platforms. In addition, the genes that define the main trends in the analysis can be easily identified.

Conclusions: CIA is a robust, efficient approach to the coupling of gene expression datasets. CIA provides simple graphical representations of the results, making it a particularly attractive method for the identification of relationships between large datasets.
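
The idea of finding successive pairs of axes with maximum covariance can be illustrated with a minimal NumPy sketch. It rests on simplifying assumptions that are not taken from the study: both tables are only column-centred, all samples and genes carry equal weight, and the axes are obtained from a singular value decomposition of the gene-by-gene cross-covariance matrix. It does not reproduce whatever row and column weighting the study itself applied. The function name `co_inertia_sketch` and the keys of the returned dictionary are hypothetical. The sketch also returns the per-sample distance between the two projections (the length of the connecting line in the bi-plot) and the RV coefficient as a global measure of agreement between the two tables.

```python
import numpy as np


def co_inertia_sketch(X, Y, n_axes=2):
    """Simplified co-inertia analysis (assumption: uniform weights, centring only).

    X : array of shape (n_samples, n_genes_x), e.g. one array platform
    Y : array of shape (n_samples, n_genes_y), e.g. the other platform
    Rows of X and Y must describe the same samples in the same order;
    the gene sets do not have to match.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must contain the same samples (rows).")
    n = X.shape[0]

    # Centre every gene so that cross-products are covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Cross-covariance between the genes of the two tables.
    # Note: this matrix is n_genes_x by n_genes_y and can be large for real arrays.
    C = Xc.T @ Yc / n

    # Successive singular vector pairs give axes of maximum covariance.
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    axes_x = U[:, :n_axes]
    axes_y = Vt[:n_axes].T

    # Project the samples of each table onto its own axes.
    scores_x = Xc @ axes_x
    scores_y = Yc @ axes_y

    # Scale scores to unit variance so both projections share one plot.
    norm_x = scores_x / scores_x.std(axis=0)
    norm_y = scores_y / scores_y.std(axis=0)

    # Length of the line joining the two projections of each sample.
    divergence = np.linalg.norm(norm_x - norm_y, axis=1)

    # RV coefficient: global similarity of the two sample configurations.
    Wx = Xc @ Xc.T
    Wy = Yc @ Yc.T
    rv = np.trace(Wx @ Wy) / np.sqrt(np.trace(Wx @ Wx) * np.trace(Wy @ Wy))

    return {
        "axes_x": axes_x,            # gene loadings, table X
        "axes_y": axes_y,            # gene loadings, table Y
        "covariances": s[:n_axes],   # covariance captured by each axis pair
        "scores_x": norm_x,
        "scores_y": norm_y,
        "divergence": divergence,
        "rv": rv,
    }
```

In this sketch, genes with the largest absolute loadings in `axes_x` or `axes_y` are the ones that contribute most to a given axis, which corresponds to identifying the genes that define the main trends. A short `divergence` value for a sample indicates that the two tables place it similarly, and a long one indicates disagreement.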
