Issues in developing integrated genomic databases and application to the human X chromosome

MOTIVATION: In the past decade, a vast amount of mapping data has been generated on the human X chromosome, but no mechanism exists to provide a global view of exactly what has been achieved. Large datasets are available electronically, but in heterogeneous formats and with incompatible access modes. In addition, relationships between objects in different datasets are often not specified.

RESULTS: We discuss the problem of integrating these data into one database and define a number of requirements that are vital for any integration approach. We have developed IXDB, the Integrated X chromosome database, which fulfils those requirements and aims to provide a global view of genomic data at the chromosomal level. IXDB represents a conceptual framework based on identifying, storing and analysing relationships between biological objects, and includes a series of tools to automate the integration of such information. It currently focuses on physical mapping data, as a starting point towards a map of the human X chromosome that should provide a uniform and global research resource for ongoing and future sequencing and functional studies.

AVAILABILITY: IXDB is available at http://ixdb.mpimg-berlin-dahlem.mpg.de. The iace2ixdb software and a description of the Iace data format are available from the authors.

CONTACT: hrc@genoscope.cns.fr
