Improving the Semantics of a Conceptual Schema of the Human Genome by Incorporating the Modeling of SNPs

In genetic research, the single nucleotide polymorphism (SNP) plays an important role in detecting genes associated with complex ailments and in detecting an individual's hereditary susceptibility to a specific trait. When the concept was discussed during the development of a conceptual schema for the human genome, it became clear that a high degree of conceptual ambiguity surrounds the term. Resolving this ambiguity has led to the main research question: what makes a genetic variation classified as a SNP different from genetic variations not classified as SNPs? Optimal biological research requires an unambiguous conceptualization. Our main contribution is to show how conceptual modeling techniques applied to human genome concepts can help to disambiguate and correctly represent the relevant concepts in a conceptual schema, thereby achieving a deeper and more adequate understanding of the domain.
