SNP-Converter: An Ontology-Based Solution to Reconcile Heterogeneous SNP Descriptions for Pharmacogenomic Studies

Pharmacogenomics explores the impact of individual genomic variations on health problems such as adverse drug reactions. Records of millions of genomic variations, mostly Single Nucleotide Polymorphisms (SNPs), are available today in various overlapping and heterogeneous databases. Selecting a proper set of polymorphisms and extracting it from these databases or from private sources are the first steps of a KDD (Knowledge Discovery in Databases) process in pharmacogenomics. This is, however, a tedious task, hampered by the heterogeneity of SNP nomenclatures and annotations. Standards for representing genomic variants have been proposed by the Human Genome Variation Society (HGVS). The SNP-Converter application aims to convert any SNP description into an HGVS-compliant pivot description and vice versa. Used within a knowledge system, SNP-Converter acts as a wrapper that contributes to semantic data integration and enrichment.
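
To make the idea of a pivot description concrete, the following is a minimal Python sketch of a round trip between a structured SNP record and an HGVS-style genomic substitution string of the form `reference:g.positionRef>Alt`. It is not the SNP-Converter implementation: the `Substitution` class, the `to_hgvs` and `from_hgvs` functions, and the accession and position used in the usage note are hypothetical names and values chosen for illustration, and only simple single-nucleotide substitutions on a genomic reference are handled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Substitution:
    """Hypothetical structured record for one single-nucleotide substitution."""
    reference: str  # reference sequence accession (illustrative value in the usage note)
    position: int   # 1-based position on the reference sequence
    ref: str        # reference nucleotide
    alt: str        # variant nucleotide


# Matches only genomic ("g.") single-nucleotide substitutions such as "ACC:g.123A>G".
_GENOMIC_SUBSTITUTION = re.compile(
    r"^(?P<reference>[^:]+):g\.(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def to_hgvs(snp: Substitution) -> str:
    """Serialize the record as an HGVS-style genomic substitution string."""
    return f"{snp.reference}:g.{snp.position}{snp.ref}>{snp.alt}"


def from_hgvs(text: str) -> Substitution:
    """Parse an HGVS-style genomic substitution string back into a record."""
    match = _GENOMIC_SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple genomic substitution: {text!r}")
    return Substitution(
        reference=match["reference"],
        position=int(match["position"]),
        ref=match["ref"],
        alt=match["alt"],
    )
```

For example, `to_hgvs(Substitution("NC_000001.10", 12345, "A", "G"))` yields `NC_000001.10:g.12345A>G`, and `from_hgvs` applied to that string returns an equal record. Descriptions in other nomenclatures, such as database identifiers, would need a lookup step that this sketch does not attempt.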
