Facing the Challenges of Genome Information Systems: A Variation Analysis Prototype

In bioinformatics, there is a lack of software tools that meet the requirements of biologists. For instance, when a DNA sample is sequenced, a lot of work has to be performed manually, and several tools are used. Applying Information Systems (IS) principles to the development of bioinformatics tools opens an interesting new research path. One of the most promising approaches is the use of conceptual models to define precisely how genomic data is represented in an IS. This work introduces how to build a Genome Information System (GIS) using these principles. As a first step toward this goal, a conceptual model that formally describes genomic mutations is presented. In addition, as a proof of concept, a variation analysis prototype has been implemented with this conceptual model as its development core.
