Platform dependence of inference on gene-wise and gene-set involvement in human lung development

Background: With the recent development of microarray technologies, the comparability of gene expression data obtained from different platforms poses an important problem. We evaluated two widely used platforms, the Affymetrix U133 Plus 2.0 and the Illumina HumanRef-8 v2 Expression BeadChips, for comparability in a biological system in which changes may be subtle, namely fetal lung tissue as a function of gestational age.

Results: We performed the comparison via sequence-based probe matching between the two platforms. "Significance grouping" was defined as a measure of comparability. Using both expression correlation and significance grouping as measures of comparability, we demonstrated that, despite overall cross-platform differences at the single-gene level, correlation between the two platforms was higher for genes with higher expression levels, higher probe overlap, and lower p-values. We also demonstrated that biological function, as determined via KEGG pathways or GO categories, is more consistent across platforms than single-gene analysis.

Conclusion: We conclude that while the comparability of the platforms at the single-gene level may be increased by increasing sample size, the platforms are highly comparable ontologically, even for subtle differences in a relatively small sample. Biologically relevant inference should therefore be reproducible across laboratories using different platforms.
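The expression-correlation comparison described in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the input layout (one dictionary per platform mapping a matched gene to its log-scale expression values across the same samples), and the choice of equal-sized expression bins summarized by the median are all assumptions made for illustration.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_by_expression(platform_a, platform_b, n_bins=2):
    """Median cross-platform correlation per expression-level bin.

    platform_a, platform_b: hypothetical dicts mapping gene -> list of
    log-scale expression values, one value per sample, samples in the
    same order on both platforms. Only genes present on both are used.
    Returns a list of (mean expression of bin, median correlation),
    ordered from the lowest- to the highest-expressed bin.
    """
    per_gene = []
    for gene in sorted(set(platform_a) & set(platform_b)):
        a, b = platform_a[gene], platform_b[gene]
        r = pearson(a, b)
        if math.isnan(r):
            continue  # constant profile: correlation undefined
        mean_expr = (sum(a) / len(a) + sum(b) / len(b)) / 2
        per_gene.append((mean_expr, r))

    per_gene.sort()  # ascending mean expression
    total = len(per_gene)
    summary = []
    for i in range(n_bins):
        chunk = per_gene[i * total // n_bins:(i + 1) * total // n_bins]
        if not chunk:
            continue
        bin_mean = sum(m for m, _ in chunk) / len(chunk)
        bin_median_r = statistics.median(r for _, r in chunk)
        summary.append((bin_mean, bin_median_r))
    return summary
```

Called on two matched expression tables, the function returns one summary pair per expression bin, which allows the per-bin correlations to be compared directly. Stratifying by probe overlap or by p-value would follow the same pattern with a different sort key.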
