Paper information - Finding protein domain boundaries: an automated, non-homology-based method

We present a sequence-based method for identifying the boundaries of structural domains in proteins. The method does not depend on knowledge of a protein's structure or on sequence homologs. Instead, we developed a Bayesian approach modeled on statistical analyses of word content used in other fields. The method first catalogs "pattern" frequencies (occurrences of groups of amino acids) in a nonredundant database of known protein domains, then uses the distributions of these patterns to identify regions of a protein sequence that appear to signal the beginnings and ends of domains. The domain-delineating signals produced from these amino acid patterns show promise for providing further insight into both the biochemistry and structural biology of proteins.
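The two steps described in the abstract (cataloging pattern frequencies, then using their distributions to flag likely boundaries) can be illustrated with a minimal sketch. This is not the authors' model: the pattern length `K`, the boundary `WINDOW`, the add-one pseudocount, the log-odds score, and the toy training sequences are all assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter

K = 2        # pattern length in residues (assumed; the abstract does not specify)
WINDOW = 3   # patterns starting within this distance of a boundary count as "boundary" (assumed)
PSEUDO = 1.0 # add-one smoothing so unseen patterns do not give zero probabilities (assumed)


def train(examples, k=K, window=WINDOW):
    """Count k-residue patterns near known boundaries versus elsewhere.

    examples: list of (sequence, [boundary positions]) pairs.
    Returns two Counters: boundary-region counts and interior counts.
    """
    boundary, interior = Counter(), Counter()
    for seq, bounds in examples:
        for i in range(len(seq) - k + 1):
            near = any(abs(i - b) <= window for b in bounds)
            (boundary if near else interior)[seq[i:i + k]] += 1
    return boundary, interior


def log_odds(pattern, boundary, interior, pseudo=PSEUDO):
    """Smoothed log ratio of P(pattern | boundary) to P(pattern | interior)."""
    vocab = len(set(boundary) | set(interior)) + 1  # +1 slot for unseen patterns
    p_b = (boundary[pattern] + pseudo) / (sum(boundary.values()) + pseudo * vocab)
    p_i = (interior[pattern] + pseudo) / (sum(interior.values()) + pseudo * vocab)
    return math.log(p_b / p_i)


def score_positions(seq, boundary, interior, k=K):
    """Score every pattern start position; higher means more boundary-like."""
    return [log_odds(seq[i:i + k], boundary, interior)
            for i in range(len(seq) - k + 1)]


# Toy, made-up training data: each sequence has one annotated boundary at index 5.
examples = [("AAAAAGSAAAAA", [5]), ("LLLLLGSLLLLL", [5])]
boundary_counts, interior_counts = train(examples)
scores = score_positions("AAAAGSAAAA", boundary_counts, interior_counts)
best = max(range(len(scores)), key=scores.__getitem__)  # most boundary-like start position
```

In this sketch a positive score means the pattern was seen relatively more often near annotated boundaries than in domain interiors. A real application would train on a curated nonredundant domain database and would likely combine evidence across neighboring positions rather than scoring each pattern in isolation.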

Brian M. Gurbaxani | Parag Mallick
