Automated data integration for developmental biological research

In an era exploding with genome-scale data, a major challenge for developmental biologists is how to extract significant clues from publicly available data sets to benefit studies of individual genes, and how to use those data to improve our understanding of development at a systems level. Several studies have successfully demonstrated new approaches to classic developmental questions by computationally integrating various genome-wide data sets. Such computational approaches have shown great potential for facilitating research: instead of testing 20,000 genes, researchers might test 200 to the same effect. We discuss the nature and state of this art as it applies to developmental research.
