Standards for systems biology

High-throughput technologies are generating large amounts of complex data that have to be stored in databases, communicated to data analysis tools and interpreted by scientists. Data representation and communication standards are needed to carry out these steps efficiently. Here we classify standards related to systems biology and discuss standardization in the life sciences more generally: why are some standards more successful than others, what are the prerequisites for a standard to succeed, and what are the possible pitfalls?
