Visualization challenges for a new cyber-pharmaceutical computing paradigm

Celera has encountered a number of visualization problems in the course of developing tools for bioinformatics research, applying them to our data generation efforts, and making those data available to our customers. This paper presents several examples from Celera's experience. In the area of genomics, challenging visualization problems have arisen in assembling genomes, studying variations between individuals, and comparing different genomes to one another. The emerging area of proteomics has created new visualization challenges in interpreting protein expression data, studying protein regulatory networks, and examining protein structure. These examples illustrate how the field of bioinformatics is posing new challenges concerning the communication of data that are often very different from those that have heretofore dominated scientific computing. Addressing the level of detail, the degree of complexity, and the interdisciplinary barriers that characterize bioinformatic problems can be expected to be a sizable but rewarding task for the field of scientific visualization.
