Paper information - REMOTE PROTEIN HOMOLOGY DETECTION USING HIDDEN MARKOV MODELS - 字舞流文

REMOTE PROTEIN HOMOLOGY DETECTION USING HIDDEN MARKOV MODELS

Steven Johnson Rob Mitra Tim Schedl Jim Skeath Gar Stormo | S. Stormo

[1] J. Thompson,et al. Multiple sequence alignment with Clustal X. , 1998, Trends in biochemical sciences.

[2] Robert Tibshirani,et al. An Introduction to the Bootstrap , 1994 .

[3] William R. Taylor,et al. The rapid generation of mutation data matrices from protein sequences , 1992, Comput. Appl. Biosci..

[4] Veronica Morea,et al. Sequence conservation in families whose members have little or no sequence similarity: the four-helical cytokines and cytochromes. , 2002, Journal of molecular biology.

[5] R Staden. Computer methods to locate signals in nucleic acid sequences , 1984, Nucleic Acids Res..

[6] Temple F. Smith,et al. The statistical distribution of nucleic acid similarities. , 1985, Nucleic acids research.

[7] I. Dodd,et al. Systematic method for the detection of potential lambda Cro-like DNA-binding regions in proteins. , 1987, Journal of molecular biology.

[8] P. Argos,et al. Weighting aligned protein or nucleic acid sequences to correct for unequal representation. , 1990, Journal of molecular biology.

[9] C. Orengo,et al. Protein families and their evolution-a structural perspective. , 2005, Annual review of biochemistry.

[10] R F Doolittle. Some reflections on the early days of sequence searching. , 1997, Journal of molecular medicine.

[11] Robert D. Finn,et al. Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins , 1999, Nucleic Acids Res..

[12] S. Henikoff,et al. Automated assembly of protein blocks for database searching. , 1991, Nucleic acids research.

[13] S. Karlin,et al. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. , 1990, Proceedings of the National Academy of Sciences of the United States of America.

[14] S F Altschul,et al. Weights for data related by a tree. , 1989, Journal of molecular biology.

[15] A. Dembo,et al. Limit Distribution of Maximal Non-Aligned Two-Sequence Segmental Score , 1994 .

[16] W. Pearson,et al. The limits of protein sequence comparison? , 2005, Current opinion in structural biology.

[17] C. Ponting,et al. On the evolution of protein folds: are similar motifs in different protein folds the result of convergence, insertion, or relics of an ancient peptide world? , 2001, Journal of structural biology.

[18] Thomas L. Madden,et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. , 2001, Nucleic acids research.

[19] Nick V Grishin,et al. doi: 10.1110/ps.03197403 , 2003 .

[20] Terence Hwa,et al. Hybrid alignment: high-performance with universal statistics , 2002, Bioinform..

[21] A. D. McLachlan,et al. Profile analysis: detection of distantly related proteins. , 1987, Proceedings of the National Academy of Sciences of the United States of America.

[22] W. Fitch. An improved method of testing for evolutionary homology. , 1966, Journal of molecular biology.

[23] S. Altschul,et al. Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[24] Patrice Koehl,et al. ASTRAL compendium enhancements , 2002, Nucleic Acids Res..

[25] Charlie Hodgman,et al. A historical perspective on gene/protein functional assignment , 2000, Bioinform..

[26] Nick V Grishin,et al. A tale of two ferredoxins: sequence similarity and structural differences , 2006 .

[27] R. Durbin,et al. Pfam: A comprehensive database of protein domain families based on seed alignments , 1997, Proteins.

[28] Bin Ma,et al. PatternHunter: faster and more sensitive homology search , 2002, Bioinform..

[29] Robert D. Finn,et al. The Pfam protein families database , 2004, Nucleic Acids Res..

[30] D. Haussler,et al. Hidden Markov models in computational biology. Applications to protein modeling. , 1993, Journal of molecular biology.

[31] Sydney Anne Cameron,et al. Molecular Evolution: A Phylogenetic Approach.—Roderic D. M. Page and Edward C. Holmes. , 2002 .

[32] Sean R. Eddy,et al. Pfam: multiple sequence alignments and HMM-profiles of protein domains , 1998, Nucleic Acids Res..

[33] Louxin Zhang,et al. Good spaced seeds for homology search , 2004, Bioinform..

[34] M. Madera,et al. A comparison of profile hidden Markov model procedures for remote homology detection. , 2002, Nucleic acids research.

[35] S. Henikoff,et al. Position-based sequence weights. , 1994, Journal of molecular biology.

[36] D. Haussler,et al. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. , 1998, Journal of molecular biology.

[37] D. Lipman,et al. Rapid and sensitive protein similarity searches. , 1985, Science.

[38] J. Kendrew,et al. The amino-acid sequence of sperm whale myoglobin. Comparison between the amino-acid sequences of sperm whale myoglobin and of human hemoglobin. , 1961, Nature.

[39] Martin Vingron,et al. A fast and sensitive multiple sequence alignment algorithm , 1989, Comput. Appl. Biosci..

[40] C. Chothia,et al. Volume changes in protein evolution. , 1994, Journal of molecular biology.

[41] M S Waterman,et al. Identification of common molecular subsequences. , 1981, Journal of molecular biology.

[42] Shmuel Pietrokovski,et al. The Blocks database--a system for protein classification , 1996, Nucleic Acids Res..

[43] M. O. Dayhoff,et al. Atlas of protein sequence and structure , 1965 .

[44] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[45] Nick V. Grishin,et al. Structural drift: a possible path to protein fold change , 2005, Bioinform..

[46] Russell F. Doolittle,et al. On the trail of protein sequences , 2000, Bioinform..

[47] Johannes Söding,et al. Protein homology detection by HMM-HMM comparison , 2005, Bioinform..

[48] Nebojsa Jojic,et al. Efficient approximations for learning phylogenetic HMM models from data , 2004, ISMB/ECCB.

[49] Chris Sander,et al. Removing near-neighbour redundancy from large protein sequence collections , 1998, Bioinform..

[50] C. Sander,et al. The FSSP database of structurally aligned protein fold families. , 1994, Nucleic acids research.

[51] G. Stormo. Consensus patterns in DNA. , 1990, Methods in enzymology.

[52] William Noble Grundy,et al. Family-based homology detection via pairwise sequence comparison , 1998, RECOMB '98.

[53] S. B. Needleman,et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins. , 1970, Journal of molecular biology.

[54] S. Altschul. Amino acid substitution matrices from an information theoretic perspective , 1991, Journal of Molecular Biology.

[55] Elena Rivas,et al. Evolutionary models for insertions and deletions in a probabilistic modeling framework , 2005, BMC Bioinformatics.

[56] G. Mitchison. A Probabilistic Treatment of Phylogeny and Sequence Alignment , 1999, Journal of Molecular Evolution.

[57] Jeremy Buhler,et al. Designing multiple simultaneous seeds for DNA similarity search , 2004, J. Comput. Biol..

[58] S F Altschul,et al. Local alignment statistics. , 1996, Methods in enzymology.

[59] J. Thompson,et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.

[60] Amir Dembo,et al. Strong limit theorems of empirical functionals for large exceedances of partial sums of i , 1991 .

[61] C. Chothia,et al. Intermediate sequences increase the detection of homology between sequences. , 1997, Journal of molecular biology.

[62] T. D. Schneider,et al. Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli. , 1982, Nucleic acids research.

[63] E. Myers,et al. Basic local alignment search tool. , 1990, Journal of molecular biology.

[64] Rod A Wing,et al. Sequence, annotation, and analysis of synteny between rice chromosome 3 and diverged grass species. , 2005, Genome research.

[65] R F Doolittle,et al. Simian sarcoma virus onc gene, v-sis, is derived from the gene (or genes) encoding a platelet-derived growth factor. , 1983, Science.

[66] Jeremy Buhler,et al. Choosing the best heuristic for seeded alignment of DNA sequences , 2006, BMC Bioinformatics.

[67] Bertil Schmidt,et al. Hyper customized processors for bio-sequence database scanning on FPGAs , 2005, FPGA '05.

[68] P. Schultz,et al. Comparative analysis of human genome assemblies reveals genome-level differences. , 2002, Genomics.

[69] Tim J. P. Hubbard,et al. SCOP: a Structural Classification of Proteins database , 1999, Nucleic Acids Res..

[70] Torbjørn Rognes,et al. Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors , 2000, Bioinform..

[71] W. J. Kent,et al. BLAT--the BLAST-like alignment tool. , 2002, Genome research.

[72] Pat Hanrahan,et al. ClawHMMER: A Streaming HMMer-Search Implementation , 2005, SC.

[73] S. Henikoff,et al. Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[74] David Haussler,et al. Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology , 1996, Comput. Appl. Biosci..

[75] Patrick Crowley,et al. Exploiting coarse-grained parallelism to accelerate protein motif finding with a network processor , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).

[76] Kimmen Sjölander,et al. COACH : profile-profile alignment of protein families using hidden Markov models , 2003 .

[77] N. Grishin,et al. COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. , 2003, Journal of molecular biology.

[78] Richard Hughey,et al. Hidden Markov models for detecting remote protein homologies , 1998, Bioinform..

[79] David Haussler,et al. Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis , 2004, J. Comput. Biol..

[80] K Karplus,et al. Predicting protein structure using only sequence information , 1999, Proteins.

[81] Richard Hughey,et al. Calibrating E-values for hidden Markov models using reverse-sequence null models , 2005, Bioinform..

[82] Y. Matsuo,et al. Exploration of novel motifs derived from mouse cDNA sequences. , 2002, Genome research.

[83] T. D. Schneider,et al. Information content of binding sites on nucleotide sequences. , 1986, Journal of molecular biology.

[84] Robert D. Finn,et al. Pfam: clans, web tools and services , 2005, Nucleic Acids Res..

[85] Patrice Koehl,et al. The ASTRAL compendium for protein structure and sequence analysis , 2000, Nucleic Acids Res..

[86] P. Arruda,et al. Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane , 2006 .

[87] 김동규,et al. [Book review] "Algorithms on Strings, Trees, and Sequences" , 2000 .

[88] N. Grishin,et al. KH domain: one motif, two folds. , 2001, Nucleic acids research.

[89] D. Lipman,et al. Rapid similarity searches of nucleic acid and protein data banks. , 1983, Proceedings of the National Academy of Sciences of the United States of America.