Engineering in genomics

This work addresses two principles that will be integral to the post-genomic, or proteomic, era (i.e., the period after sequencing is complete). The first is that any analysis of data from or related to the Human Genome Project must be designed with high throughput in mind. The sequence alone will encompass some 3 billion nucleotides, and that figure does not include information about introns, exons, promoters, and many other features of interest. The volume of information that must be synthesized is therefore even larger than the genome itself, and it is diverse in nature. It includes sequence, structural, functional, and localization information for each gene, and each of those categories has its own levels of organization (e.g., functional information for a protein can be obtained at the molecular, cellular, and organismal levels). Computational analysis must be able to handle all of these data in a reasonable amount of time. The second principle, already alluded to above, is that analysis techniques must incorporate data from a variety of sources. Archiving and indexing of sequence data, for example, must include sequences from multiple organisms and from both diseased and healthy states to be maximally useful. The other levels of information, including structure, function, and localization, will need to be similarly organized.
