In modern software engineering, it is widely accepted that conceptual modeling techniques provide an accurate description of the problem domain. Applying these techniques before developing the associated software representations (implementations) supports the development of high-quality software systems. Applying these ideas to new, challenging domains, such as modern genomics, is a fascinating task. In particular, this chapter shows how the complexity of human genome interpretation can be addressed from a pure conceptual modeling perspective, so that it can be described and understood more clearly and precisely. We aim to show that a conceptual schema of the human genome allows us to better understand the functional and structural relations that exist between genes and the transcription and translation processes that explain protein synthesis. Genome, genes, alleles, gene mutations: all these concepts should be properly specified through the creation of the corresponding conceptual schema, and the result of these efforts is presented here. First, an initial conceptual schema is proposed. It includes a first version of the basic genomic notions that characterize the description of the human genome. A set of challenging concepts is then identified: those whose representation requires a more detailed specification. As knowledge about the domain increases, the evolution of the model is introduced and justified, with the ultimate goal of obtaining a stable, final version of the conceptual schema of the human genome. During this process, the most critical concepts are outlined, and the final decisions adopted to model them adequately are discussed. Having such a conceptual schema enables the creation of a corresponding database.
This database could hold the contents needed to exploit bio-genomic information in the structured and precise way that database technology has historically provided. This strategy differs markedly from current biological data source ontologies, which are heterogeneous, imprecise and too often even inconsistent.