Cross-Platform Microarray Data Normalisation for Regulatory Network Inference

Background: Inferring Gene Regulatory Networks (GRNs) from time-course microarray data suffers from the dimensionality problem: the available time series are short compared to the large number of genes in the network. To overcome this, integrating data from diverse sources is necessary. Microarray data from different sources and platforms are publicly available, but integration is not straightforward because of platform and experimental differences.

Methods: We analyse different normalisation approaches for microarray data integration in the context of reverse engineering quantitative GRN models. We introduce two preprocessing approaches based on existing normalisation techniques and provide a comprehensive comparison of the normalised datasets.

Conclusions: The results identify a method that combines Loess normalisation with iterative K-means as the best for time-series normalisation in this problem.
