The Construction and Use of Log-Odds Substitution Scores for Multiple Sequence Alignment
John C. Wootton | Stephen F. Altschul | Yi-Kuo Yu | Elena Zaslavsky