The phenotype and genotype experiment object model (PaGE‐OM): a robust data structure for information related to DNA variation

Torrents of genotype–phenotype data are being generated, all of which must be captured, processed, integrated, and exploited. Doing this optimally requires standard and interoperable “object models,” which describe how to partition the total spectrum of information being dealt with into elemental “objects” (such as “alleles,” “genotypes,” “phenotype values,” “methods”) with precisely stated logical interrelationships (such as “A objects are made up from one or more B objects”). We herein propose the Phenotype and Genotype Experiment Object Model (PaGE‐OM; www.pageom.org), which has been tested and implemented in conjunction with several major databases, and approved as a standard by the Object Management Group (OMG). PaGE‐OM is open‐source, ready for use by the wider community, and can be further developed as needs arise. It will help to improve information management, assist data integration, and simplify the task of informatics resource design and construction for genotype and phenotype data projects. Hum Mutat 30, 968–977, 2009. © 2009 Wiley‐Liss, Inc.
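To make the idea of elemental objects with stated interrelationships concrete, the following is a minimal sketch of the relationship “A objects are made up from one or more B objects,” using a genotype composed of alleles. The class names `Allele` and `Genotype`, their fields, and the `label` helper are illustrative assumptions chosen for this example; they are not taken from the PaGE‐OM specification itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Allele:
    """One elemental object: a single observed form of a marker (hypothetical fields)."""
    marker_id: str
    sequence: str


@dataclass
class Genotype:
    """A composite object made up from one or more Allele objects (hypothetical fields)."""
    sample_id: str
    alleles: List[Allele] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the "one or more" part of the stated relationship.
        if not self.alleles:
            raise ValueError("a Genotype must contain at least one Allele")

    def label(self) -> str:
        # Join the allele sequences in the order given, e.g. "A/G".
        return "/".join(a.sequence for a in self.alleles)
```

For instance, `Genotype("sample-1", [Allele("marker-1", "A"), Allele("marker-1", "G")]).label()` yields `"A/G"`, while constructing a `Genotype` with an empty allele list raises `ValueError`. Encoding the cardinality rule in the object itself is one way a shared model can keep data from different sources structurally consistent.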
