Functional Genomics

Protein structure prediction has improved steadily over the past two decades. However, current methods are still far from consistently producing accurate structural models with the computing power available to typical users. To address this challenge, we developed MUFOLD, a hybrid method for protein tertiary structure prediction that combines whole and partial template information with new computational techniques. MUFOLD covers both template-based and ab initio prediction within a single framework and aims for high accuracy and fast computation. Its two major novel contributions are graph-based model generation and molecular dynamics ranking (MDR). By formulating prediction as a graph realization problem, we can apply Multidimensional Scaling (MDS), an efficient optimization approach, to speed up prediction dramatically. In addition, within this framework we consistently improve the predictions by iteratively reusing information from the generated models. MDR, in contrast to widely used static scoring functions, exploits the dynamic properties of structures to evaluate their quality, which can often identify the best structures in a pool more effectively.
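To make the graph realization idea concrete, the following is a minimal sketch of classical MDS, which recovers 3D coordinates from a matrix of pairwise distances. It is an illustration only, not MUFOLD's implementation: it assumes a complete and exact distance matrix, whereas restraints derived from templates would typically be partial and noisy. The function names `classical_mds` and `pairwise_distances` and the random test points are hypothetical.

```python
import numpy as np


def classical_mds(dist, dim=3):
    """Embed points in `dim` dimensions from a full pairwise distance matrix.

    The result is determined only up to rotation, reflection and translation.
    """
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # The Gram matrix is symmetric, so eigh applies; keep the largest eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


def pairwise_distances(coords):
    """Return the Euclidean distance matrix for an (n, dim) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical toy input: 8 random points standing in for residue positions.
rng = np.random.default_rng(0)
true_coords = rng.normal(size=(8, 3))
dist = pairwise_distances(true_coords)

model = classical_mds(dist)
# The embedding should reproduce the input distances, not the original frame.
max_error = np.abs(pairwise_distances(model) - dist).max()
```

Because the cost is dominated by a single eigendecomposition, this kind of embedding is fast compared with conformational sampling, which is consistent with the speed-up described above. A realistic setting would additionally need to handle missing and inconsistent distances.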
